Abstract. In this work, we consider a solution of automata similar to Population Protocols and Network
Constructors. The automata (also called nodes) move passively in a well-mixed solution without
being capable of controlling their movement. However, the nodes can cooperate by interacting in pairs. 
Every such interaction may result in an update of the local states of the nodes. Additionally, the nodes 
may also choose to connect to each other in order to start forming some required structure. We may 
think of such nodes as the smallest possible programmable pieces of matter, like tiny nanorobots or 
programmable molecules. The model that we introduce here is a more applied version of Network
Constructors, imposing physical (or geometrical) constraints on the connections that the nodes are allowed
to form. Each node can connect to other nodes only via a very limited number of local ports, which 
implies that at any given time it has only a bounded number of neighbors. Connections are always made 
at unit distance and are perpendicular to connections of neighboring ports. Though such a model cannot 
form abstract networks like Network Constructors, it is still capable of forming very practical 2D or 3D 
shapes. We provide direct constructors for some basic shape construction problems, like spanning line, 
spanning square, and self-replication. We then develop new techniques for determining the computational
and constructive capabilities of our model. One of the main novelties of our approach concerns
our attempt to overcome the inability of such systems to detect termination. In particular, we exploit
the assumptions that the system is well-mixed and has a unique leader, in order to give terminating 
protocols that are correct with high probability. This allows us to develop terminating subroutines that 
can be sequentially composed to form larger modular protocols (which has not been the case in the 
relevant literature). One of our main results is a terminating protocol counting the size n of the system 
with high probability. We then use this protocol as a subroutine in order to develop our universal 
constructors, establishing that it is possible for the nodes to become self-organized with high probability 
into arbitrarily complex shapes while still detecting termination of the construction. 


Keywords: distributed network construction, programmable matter, shape formation, well-mixed 
solution, homogeneous population, distributed protocol, interacting automata, fairness, random

1 Introduction

Recent research in distributed computing theory and practice is taking its first timid steps in the
pioneering endeavor of investigating the possible relationships of distributed computing systems to
physical and biological systems. The first main motivation for this is the fact that a wide range of
physical and biological systems are governed by underlying laws that are essentially algorithmic.
The second is that the higher-level physical or behavioral properties of such systems are usually the
outcome of the coexistence, which may include both cooperation and competition, and constant
interaction of very large numbers of relatively simple distributed entities respecting such laws. This
effort, to the extent that its perspective allows, is expected to promote our understanding of the
algorithmic aspects of our (distributed) natural world and to develop innovative artificial systems
inspired by them.

* Supported in part by the project “Foundations of Dynamic Distributed Computing Systems” (FOCUS) which is
implemented under the “ARISTEIA” Action of the Operational Programme “Education and Lifelong Learning”
and is co-funded by the European Union (European Social Fund) and Greek National Resources.


Ulam’s and von Neumann’s Cellular Automata (cf. e.g. ISM), essentially a distributed grid
network of automata, have been used as models for self-replication, for modeling several physical
systems (e.g. neural activity, bacterial growth, pattern formation in nature), and for understanding
emergence, complexity, and self-organization issues. Population Protocols of Angluin et al.
[AAD+06] were originally motivated by highly dynamic networks of simple sensor nodes that cannot
control their mobility. Recently, Doty [Dot14] demonstrated their formal equivalence to chemical
reaction networks (CRNs), which model chemistry in a well-mixed solution. Moreover, the Network
Constructors extension of population protocols [MS14] showed that a population of finite automata
that interact randomly like molecules in a well-mixed solution and that can establish bonds with
each other according to the rules of a common small protocol can construct arbitrarily complex
stable networks [MS14]. In the young area of DNA self-assembly it has already been demonstrated that
it is possible to fold long, single-stranded DNA molecules into arbitrary nanoscale two-dimensional
shapes and patterns [Rot06]. Recently, an interesting theoretical model was proposed, the Nubot
model, for studying the complexity of self-assembled structures with active molecular components
[WCG+13]. This model is inspired by biology’s fantastic ability to assemble biomolecules that form
systems with complicated structure and dynamics, from molecular motors that walk on rigid tracks
and proteins that dynamically alter the structure of the cell during mitosis, to embryonic development
where large-scale complicated organisms efficiently grow from a single cell. Also recently
a system was reported that demonstrates programmable self-assembly of complex two-dimensional
shapes with a thousand-robot swarm, called the Kilobot [RCN14]. This was enabled by creating
small, cheap, and simple autonomous robots designed to operate in large groups and to cooperate
through local interactions and by developing a collective algorithm for shape formation that is
highly robust to the variability and error characteristic of large-scale decentralized systems.


The established and ongoing research seems to have opened the road towards a vision that will
further reshape society to an unprecedented degree. This vision concerns our ability to manipulate
matter via information-theoretic and computing mechanisms and principles. It will be the jump
from amorphous information to the incorporation of information into the physical world. Information
will not only be part of the physical environment: it will constantly interact with the surrounding
environment and will have the ability to reshape it. Matter will become programmable [GCM05],
which is a plausible future outcome of progress in high-volume nanoscale assembly that makes it
feasible to inexpensively produce millimeter-scale units that integrate computing, sensing, actuation,
and locomotion mechanisms. This will enable the astonishing possibility of transferring the
discrete dynamics from the computer memory black-box to the real world and to achieve a physical
realization of any computer-generated object. It will have profound implications for how we think
about chemistry and materials. Materials will become user-programmed and smart, adapting to
changing conditions in order to maintain, optimize or even create a whole new functionality using
means that are intrinsic to the material itself. It will even change the way we think about engineering
and manufacturing. We will for the first time be capable of building smart machines that adapt
to their surroundings, such as an airplane wing that adjusts its surface properties in reaction to
environmental variables [Zak07], or even further realize machines that can self-build autonomously.

1.1 Our Approach


We imagine here a “solution” of automata (also called nodes or processes throughout the paper), 
a setting similar to that of Population Protocols and Network Constructors. Due to its highly 
restricted computational nature and its very local perspective, each individual automaton can 
practically achieve nothing on its own. However, when many of them cooperate, each contributing its 
meager computational capabilities, impressive global outcomes become feasible. This is, for example, 
the case in the Kilobot system, where each individual robot is a remarkably simple artifact that 
can perform only primitive locomotion via a simple vibration mechanism. Still, when a thousand of 
them work together, their global dynamics and outcome resemble the complex functions of living 
organisms. From our perspective, cooperation involves the capability of the nodes to communicate 
by interacting in pairs and to bind to each other in an algorithmically controlled way. In particular, 
during an interaction, the nodes can update their local states according to a small common program 
that is stored in their memories and may also choose to connect to each other in order to start 
forming some required structure. Later on, if needed, they may choose to drop their connection, e.g. 
for rearrangement purposes. We may think of such nodes as the smallest possible programmable 
pieces of matter. For example, they could be tiny nanorobots or programmable molecules (e.g. 
DNA strands). Naturally, such elementary entities are not (yet) expected to be equipped with 
some internal mobility mechanism. Still, it is reasonable to expect that they could be part of some 
dynamic environment, like a boiling liquid or the human circulatory system, providing an external 
(to the nodes) interaction mechanism. This, together with the fact that the dynamics of such models
have been recently shown to be equivalent to those of CRNs, motivates the idea of regarding such
systems as a solution of programmable entities. We model such an environment by imagining an
adversary scheduler operating in discrete steps and selecting in every step a pair of nodes to interact 
with each other. 

Our main focus in this work, building upon the findings of [MS14], is to further investigate the
cooperative structure formation capabilities of such systems. Our first main goal is to introduce a 
more realistic and more applicable version of network constructors by adjusting some of the abstract 
parameters of the model of [MS14]. In particular, we introduce some physical (or geometrical)
constraints on the connections that the processes are allowed to form. In the network constructors 
model of [MS14], there were no such imposed restrictions, in the sense that, at any given step,
any two processes were candidates for an interaction, independently of their relative positioning 
in the existing structure/network. For example, even two nodes hidden in the middle of distinct 
dense components could interact and, additionally, there was no constraint on the number of active 
connections that a node could form (could be up to the order of the system). This was very 
convenient for studying the capability of such systems to self-organize into abstract networks and 
it helped show that arbitrarily complex networks are in principle constructible. On the other hand, 
this is not expected to be the actual mechanism of at least the first potential implementations. 
First implementations will most probably be characterized by physical and geometrical constraints. 
To capture this in our model, we assume that each device can connect to other devices only via a 
very limited (finite and independent of the size of the system) number of ports, usually four or six, 
which implies that, at any given time, a device has only a bounded number of neighbors. Moreover, 
we further restrict the connections to be always made at unit distance and to be perpendicular to 
connections of neighboring ports. Though such a model can no longer form abstract networks, it 
may still be capable of forming very practical 2-dimensional or 3-dimensional shapes. This is also
in agreement with natural systems, where the complexity and physical properties of a system are
rarely the result of an unrestricted interconnection between entities. 


It can be immediately observed that the universal constructors of [MS14] do not apply in this 
case. In particular, those constructors cannot be adopted in order to characterize the constructive 
power of the model considered here. The reason is that they work by arranging the nodes in a long 
line and then exploiting the fact that connections are elastic and allow any pair of nodes of the line 
to interact independently of the distance between them. In contrast, no elasticity is allowed in the 
more local model considered here, where a long line can still be formed but only adjacent nodes 
of the line are allowed to interact with each other. As a result, we have to develop new techniques 
for determining the computational and constructive capabilities of our model. The other main
novelty of our approach concerns our attempt to overcome the inability of such systems to detect
termination due to their limited global knowledge and their limited computational resources. For
example, it can be easily shown that deterministic termination of population protocols can fail
even in determining whether there is a single $a$ in an input assignment, mainly because the nodes
do not know and cannot store in their memories either the size of the network or some upper
bound on the time it takes to meet (or to influence or to be influenced by) every other node. To
overcome the storage issue, we exploit the ability of nodes to self-assemble into larger structures 
that can then be used as distributed memories of any desired length. Moreover, we exploit the 
common (and natural in several cases) assumption that the system is well-mixed, meaning that, 
at any given time, all permissible pairs of node-ports have an equal probability to interact, in 
order to give terminating protocols that are correct with high probability. This is crucial not only
because it allows us to improve eventual stabilization to eventual termination but, most importantly,
because it allows us to develop terminating subroutines that can be sequentially composed to form
larger modular protocols. Such protocols are more efficient, more natural, and more amenable
to clear proofs of correctness, compared to existing protocols that are based on composing all
subroutines in parallel and “sequentializing” them eventually by perpetual reinitializations. To the
best of our knowledge, [MCS12] is the only work that has considered this issue but with totally
different and more deterministic assumptions. Several other papers [AAD+06, AAE08, MS14] have
already exploited a uniform random interaction model, but in all cases for analyzing the expected 
time to convergence of stabilizing protocols and not for maximizing the correctness probability of 
terminating protocols, as we do here. 


In Section 2 we discuss further related literature. Section 3 formally defines the model under
consideration and brings together all definitions and basic facts that are used throughout the paper.
In Section 4 we provide direct (stabilizing) constructors for some basic shape construction problems.
Section 5 introduces our technique for counting the size $n$ of the system with high probability. The
result of that section (i.e. Theorem 1) is of particular importance as it underlies all sequential
composition arguments that follow in the paper. In particular, the protocol of Section 5 is used as
a subroutine in our universal constructors, presented in Section 6, establishing that it is possible to
construct with high probability arbitrarily complex shapes (and patterns) by terminating protocols.
These universality results are discussed in Section 6. In Section 7 we consider the problem of
shape self-replication. Finally, in Section 8 we conclude and give further research directions that
are opened by our work.

2 Further Related Work


Population Protocols. Our model for shape construction is strongly inspired by the Population
Protocol model [AAD+06] and the Mediated Population Protocol model [MCS11a]. In the former,
connections do not have states. States on the connections were first introduced in the latter. The
main difference to our model is that in those models the focus was on the computation of functions
of some input values and not on network construction. Another important difference is that we
allow the edges to choose between only two possible states, which was not the case in [MCS11a].
Interestingly, when operating under a uniform random scheduler, population protocols are formally
equivalent to chemical reaction networks (CRNs), which model chemistry in a well-mixed solution
[Dot14]. CRNs are widely used to describe information processing occurring in natural cellular
regulatory networks, and with upcoming advances in synthetic biology, CRNs are a promising
programming language for the design of artificial molecular control circuitry. However, CRNs and
population protocols can only capture the dynamics of molecular counts and not of structure
formation. Our model then may also be viewed as an extension of population protocols and
CRNs aiming to capture the stable structures that may occur in a well-mixed solution. From this
perspective, our goal is to determine what stable structures can result in such systems (natural
or artificial), how fast, and under what conditions (e.g. by what underlying codes/reaction-rules).
Most computability issues in the area of population protocols have now been resolved. Finite-state
processes on a complete interaction network, i.e. one in which every pair of processes may interact,
(and several variations) compute the semilinear predicates [AAER07]. Semilinearity persists up to
$o(\log \log n)$ local space but not more than this [CMN+11]. If additionally the connections between
processes can hold a state from a finite domain (note that this is a stronger requirement than the
on/off that the present work assumes) then the computational power dramatically increases to the
commutative subclass of $\mathbf{NSPACE}(n^2)$ [MCS11a]. Other important works include [GR09], which
equipped the nodes of population protocols with unique ids, and [BBCK10], which introduced a
(weak) notion of speed of the nodes that allowed the design of fast converging protocols with only
weak requirements. For introductory texts see [AR07, MCS11b].


Algorithmic Self-Assembly. There are already several models trying to capture the self-assembly
capability of natural processes with the purpose of engineering systems and developing algorithms
inspired by such processes. For example, [Dot12] proposes to learn how to program molecules to
manipulate themselves, grow into machines and at the same time control their own growth. The
research area of “algorithmic self-assembly” belongs to the field of “molecular computing”. The
latter was initiated by Adleman [Adl94], who designed interacting DNA molecules to solve an
instance of the Hamiltonian path problem. The model that has guided the study in algorithmic
self-assembly is the Abstract Tile Assembly Model (aTAM) [Win98, RW00] and variations.
Recently, the Nubot model was proposed [WCG+13], which was another important influence
for our work. That model aims at motivating engineering of molecular structures that have
complicated active dynamics of the kind seen in living biomolecular systems. It tries to capture
the interplay between molecular structure and dynamics. Simple molecular components form
assemblies that can grow (exponentially fast, by successive doublings) and shrink, and individual
components undergo state changes and move relative to each other. The main result of [WCG+13]
was that any computable shape of size $\leq n \times n$ can be built in time polylogarithmic in $n$, plus
roughly the time needed to simulate a TM that computes whether or not a given pixel is in the
final shape.


Distributed Network Construction. To the best of our knowledge, classical distributed
computing has not considered the problem of constructing an actual communication network
from scratch. From the seminal work of Angluin [Ang80] that initiated the theoretical study of
distributed computing systems up to now, the focus has been more on assuming a given communication
topology and constructing a virtual network over it, e.g. a spanning tree for the purpose
of fast dissemination of information. Moreover, these models assume most of the time unique
identities, unbounded memories, and message-passing communication. Additionally, a process
always communicates with its neighboring processes (see [Lyn96] for all the details). An exception
is the area of geometric pattern formation by mobile robots (cf. [SY99, DFSY10] and references
therein). A great difference, though, to our model is that in mobile robotics the computational
entities have complete control over their mobility and thus over their future interactions. That is,
the goal of a protocol is to result in a desired interaction pattern while in our model the goal of
a protocol is to construct a structure while operating under a totally unpredictable interaction
pattern. Very recently, the Amoebot model, a model inspired by the behavior of amoeba that allows
algorithmic research on self-organizing particle systems and programmable matter, was proposed
[DGRS13, DDG+14]. The goal is for the particles to self-organize in order to adapt to a desired
shape without any central control, which is quite similar to our objective; however, the two models
have little in common. In the same work, the authors observe that, in contrast to the considerable
work that has been performed w.r.t. systems (e.g. self-reconfigurable robotic systems), only
very little theoretical work has been done in this area. This further supports the importance of
introducing a simple, yet sufficiently generic, model for distributed shape construction, as we do
in this work.


Network Formation in Nature. Nature has an intrinsic ability to form complex structures 
and networks via a process known as self-assembly. By self-assembly, small components (like e.g. 
molecules) automatically assemble into large, and usually complex structures (like e.g. a crystal). 
There is an abundance of such examples in the physical world. Lipid molecules form a cell’s membrane,
ribosomal proteins and RNA coalesce into functional ribosomes, and bacteriophage virus
proteins self-assemble a capsid that allows the virus to invade bacteria [Dot12]. Mixtures of RNA
fragments that self-assemble into self-replicating ribozymes spontaneously form cooperative catalytic
cycles and networks. Such cooperative networks grow faster than selfish autocatalytic cycles
indicating an intrinsic ability of RNA populations to evolve greater complexity through cooperation
[VMC+12]. Through billions of years of prebiotic molecular selection and evolution, nature has
produced a basic set of molecules. By combining these simple elements, natural processes are capable
of fashioning an enormously diverse range of fabrication units, which can further self-organize into 
refined structures, materials and molecular machines that not only have high precision, flexibility 
and error-correction capacity, but are also self-sustaining and evolving. In fact, nature shows a 
strong preference for bottom-up design. 

Systems and solutions inspired by nature have often turned out to be extremely practical and 
efficient. For example, the bottom-up approach of nature inspires the fabrication of biomaterials by 
attempting to mimic these phenomena with the aim of creating new and varied structures with novel 
utilities well beyond the gifts of nature [Zha03]. Moreover, there is already a remarkable amount
of work envisioning our future ability to engineer computing and robotic systems by manipulating
molecules with nanoscale precision. Ambitious long-term applications include molecular computers
[BPS+10] and miniature (nano)robots for surgical instrumentation, diagnosis and drug delivery in
medical applications (e.g. it has very recently been reported that DNA nanorobots could even kill
cancer cells [DBC12]) and monitoring in extreme conditions (e.g. in toxic environments). However,
the road towards this vision passes first through our ability to discover the laws governing the
capability of distributed systems to construct networks. The gain of developing such a theory will
be twofold: It will give some insight into the role (and the mechanisms) of network formation in the
complexity of natural processes and it will allow us to engineer artificial systems that achieve this
complexity.

3 A Model of Network Constructors 

There are $n$ nodes. Every node is a finite-state machine taking states from a finite set $Q$. Additionally,
every node has a bounded number of ports which it uses to interact with other nodes. In the
2-dimensional (2D) case, there are four ports $p_y$, $p_x$, $p_{-y}$, and $p_{-x}$. For notational convenience and
to improve readability we almost exclusively use $u, r, d, l$ (for up, right, down, and left, respectively)
in place of $p_y$, $p_x$, $p_{-y}$, $p_{-x}$, respectively. Similarly, in the 3-dimensional (3D) case there are six ports
$p_y$, $p_z$, $p_x$, $p_{-y}$, $p_{-z}$, and $p_{-x}$ (see Figure 1). Neighboring ports are perpendicular to each other,
forming local axes. For example, in the 2-dimensional case, $p_y \perp p_x$, $p_x \perp p_{-y}$, $p_{-y} \perp p_{-x}$, and
$p_{-x} \perp p_y$. Similar assumptions hold for the 3-dimensional case. We can imagine the nodes moving
passively in a well-mixed solution. An important remark is that the above coordinates are only for
local purposes and do not necessarily represent the actual orientation of a node in the system. A
node may be arbitrarily rotated so that, for example, its $x$ local coordinate is aligned with the $y$
real coordinate of the system or it is not aligned with any real coordinate. We assume that nodes
may interact in pairs, whenever a port of one node is at unit distance and in straight line (w.r.t.
the local axes) from a port of another node. For example, it could be the case that, at some point
during execution, the axis of the $p_y$ port of a node $u$ becomes aligned with the axis of the $p_{-x}$ port
of another node $v$ and the distance between them is one unit. Then $u$ and $v$ interact and, apart
from updating their local states, they can also activate the connection between their corresponding
ports. Later on, they can again deactivate the connection if they want.

Definition 1. A 2D (or 3D) protocol is defined by a 4-tuple $(Q, q_0, Q_{out}, \delta)$, where $Q$ is a finite set
of node-states, $q_0 \in Q$ is the initial node-state, $Q_{out} \subseteq Q$ is the set of output node-states, and
$\delta : (Q \times P) \times (Q \times P) \times \{0,1\} \to Q \times Q \times \{0,1\}$ is the transition function, where $P = \{u, r, d, l\}$
(or $P = \{p_y, p_z, p_x, p_{-y}, p_{-z}, p_{-x}\}$, resp.) is the set of ports. When required, also a special initial
leader-state $L_0 \in Q$ may be defined.

If $\delta((a,p_1),(b,p_2),c) = (a',b',c')$, we call $(a,p_1),(b,p_2),c \to (a',b',c')$ a transition (or rule). A
transition $(a,p_1),(b,p_2),c \to (a',b',c')$ is called effective if $x \neq x'$ for at least one $x \in \{a,b,c\}$ and
ineffective otherwise. When we present the transition function of a protocol we only present the
effective transitions. Additionally, we agree that the size of a protocol is the number of its states,
i.e. $|Q|$.
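
As an illustration of Definition 1, the following is a minimal Python sketch of how a 2D protocol could be represented. It is not part of the model itself: the class name `Protocol`, its field names, and the single example rule are hypothetical and chosen only for illustration. Rules that are not listed are treated as ineffective, following the convention above, and a rule is also tried with its two node-port arguments swapped, which is one possible (assumed) way of handling unordered interactions.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Tuple

PORTS_2D = ("u", "r", "d", "l")  # up, right, down, left

# Rule key: ((a, p1), (b, p2), c); rule value: (a', b', c').
RuleKey = Tuple[Tuple[str, str], Tuple[str, str], int]
RuleValue = Tuple[str, str, int]


@dataclass
class Protocol:
    states: FrozenSet[str]          # Q
    initial_state: str              # q_0
    output_states: FrozenSet[str]   # Q_out
    rules: Dict[RuleKey, RuleValue] = field(default_factory=dict)  # effective part of delta

    def delta(self, a: str, p1: str, b: str, p2: str, c: int) -> RuleValue:
        """Transition function; inputs without a listed rule are left unchanged."""
        key = ((a, p1), (b, p2), c)
        if key in self.rules:
            return self.rules[key]
        swapped = ((b, p2), (a, p1), c)
        if swapped in self.rules:  # assumption: the pair is unordered
            b_new, a_new, c_new = self.rules[swapped]
            return (a_new, b_new, c_new)
        return (a, b, c)

    def is_effective(self, a: str, p1: str, b: str, p2: str, c: int) -> bool:
        """A transition is effective if it changes at least one of a, b, c."""
        return self.delta(a, p1, b, p2, c) != (a, b, c)

    @property
    def size(self) -> int:
        """Size of the protocol, i.e. |Q|."""
        return len(self.states)


# Hypothetical toy rule (not taken from the paper): a leader L0 bonds via its
# right port to the left port of a free node q0 and hands over leadership.
toy = Protocol(
    states=frozenset({"L0", "q0", "q1"}),
    initial_state="q0",
    output_states=frozenset({"q1", "L0"}),
    rules={(("L0", "r"), ("q0", "l"), 0): ("q1", "L0", 1)},
)
```

Storing only the effective transitions mirrors the way transition functions are presented in the text; every other input falls through to the identity.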

Fig. 1. The top figure depicts the 2D version of the model. Each node has four ports and consecutive ports are
perpendicular to each other. Two nodes are interacting, the left one via its $p_x$ port and the right one via its $p_{-x}$ port.
The interaction can occur because the distance between the nodes is unit and the corresponding ports are totally
aligned (in a straight line). The bottom figure depicts the 3D version of the model. The only difference is an extra z
dimension.

The system consists of a population $V$ of $n$ distributed processes, called nodes when clear from
context. Execution of the protocol proceeds in discrete steps. In every step, an unordered (cf. [MS14]
for more details and formalism about unordered interactions) pair of node-ports $(v_1,p_1)(v_2,p_2)$ is
selected by an adversary scheduler and these nodes interact via the corresponding ports and update
their states and the state of the edge joining them according to the transition function $\delta$. Before
every step $t \geq 1$, there is a shape configuration $G(t) = (V, E(t))$ where $E(t) \subseteq \{(v_1,p_1)(v_2,p_2) :
v_1, v_2 \in V \text{ and } p_1, p_2 \in P\}$. $(v_1,p_1)(v_2,p_2) \in E(t)$ means that before the $t$th selection of the
scheduler the $p_1$ port of node $v_1$ is connected by an active edge to the $p_2$ port of node $v_2$. Observe
that not all possible $E(t)$ are valid given the restrictions that connections are made at unit distance
and are perpendicular whenever they correspond to consecutive ports of a node. For example, if
$(v_1,r)(v_2,l) \in E(t)$ then $(v_1,l)(v_2,r) \notin E(t)$. In general, $E(t)$ is valid (or feasible) if any component
defined by it is a subnetwork of the 2-dimensional grid network with unit distances (depicted e.g.
in Figure 7(a) on page 30). From now on, we call a 2D (3D) shape any connected subnetwork of
the 2D (3D) grid network with unit distances.
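
The validity condition on $E(t)$ can be made concrete with a small Python sketch for the 2D case. It is only an illustration under stated assumptions: the function name `is_valid` and the data layout are hypothetical, nodes may be rotated by quarter turns but not reflected, and each component is embedded in its own coordinate frame by a breadth-first traversal. The edge set is rejected as soon as two constraints place a node at different cells (or with different orientations) or two nodes at the same cell.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Vec = Tuple[int, int]
NodePort = Tuple[str, str]

# Direction of each port in the node's own local frame.
PORT_DIR: Dict[str, Vec] = {"u": (0, 1), "r": (1, 0), "d": (0, -1), "l": (-1, 0)}


def rot(v: Vec, k: int) -> Vec:
    """Rotate v counterclockwise by k quarter turns."""
    x, y = v
    for _ in range(k % 4):
        x, y = -y, x
    return (x, y)


def is_valid(edges: Set[Tuple[NodePort, NodePort]]) -> bool:
    """Check that every component defined by the active edges fits the unit grid."""
    adj: Dict[str, List[Tuple[str, str, str]]] = {}
    for (v1, p1), (v2, p2) in edges:
        adj.setdefault(v1, []).append((p1, v2, p2))
        adj.setdefault(v2, []).append((p2, v1, p1))

    placed: Dict[str, Tuple[Vec, int]] = {}  # node -> (cell, rotation)
    for root in adj:
        if root in placed:
            continue
        placed[root] = ((0, 0), 0)
        cells = {(0, 0)}  # cells occupied by the current component
        queue = deque([root])
        while queue:
            v = queue.popleft()
            (x, y), r = placed[v]
            for p, w, pw in adj[v]:
                dx, dy = rot(PORT_DIR[p], r)
                cell_w = (x + dx, y + dy)
                # Rotation of w that makes its port pw point back at v.
                rot_w = next(k for k in range(4) if rot(PORT_DIR[pw], k) == (-dx, -dy))
                if w in placed:
                    if placed[w] != (cell_w, rot_w):
                        return False  # inconsistent position or orientation
                elif cell_w in cells:
                    return False      # two nodes on the same cell
                else:
                    placed[w] = (cell_w, rot_w)
                    cells.add(cell_w)
                    queue.append(w)
    return True


single = {(("v1", "r"), ("v2", "l"))}
double = {(("v1", "r"), ("v2", "l")), (("v1", "l"), ("v2", "r"))}
square = {
    (("a", "r"), ("b", "l")), (("b", "u"), ("c", "d")),
    (("c", "l"), ("d", "r")), (("d", "d"), ("a", "u")),
}
```

Here `double` encodes the example from the text, in which $(v_1,r)(v_2,l)$ and $(v_1,l)(v_2,r)$ cannot both be active, while `square` is a four-node cycle that does embed in the grid.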


The shapes of $E(t)$ also determine the possible selections of the scheduler at step $t$. In particular,
$(v_1,p_1)(v_2,p_2)$ can be selected for interaction at step $t$ iff after aligning port $p_1$ of $v_1$ and $p_2$ of $v_2$ at
unit distance from each other and perpendicularly to the neighboring ports, the component that would
result by activating the connection (that component is the union of the shape of $v_1$ and the shape
of $v_2$) is a shape. In particular, there should be no two nodes in the union (one from the first shape
and one from the other) that occupy the same position of the grid network (i.e. that one falls over
the other). For a simple example, imagine an L shape consisting of three nodes and a vertical line
| consisting of two nodes. The bottom node of the line is allowed to occupy the missing corner of
the L while, on the other hand, the upper node of the line is not allowed (unless rotated) because
this would result in the lower node of the line falling over the right node of the L. Though in
principle a connected component could operate autonomously internally without having to wait for
the scheduler to pick a pair of connected nodes to interact, throughout this work, for simplicity and
uniformity of the arguments, we also restrict interactions between connected pairs to be controlled
only by the scheduler. Notice that any port pair that is connected by an active edge before step $t$
may be selected by the scheduler at step $t$.
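
The L-and-line example above can be checked mechanically. The following Python sketch is a simplified illustration, not the paper's formalism: the function name `can_interact` and the representation of a component as a map from node to (grid cell, rotation) are assumptions, only the 2D case with two distinct components is handled, rotations are by quarter turns without reflection, and the test for whether the two ports are still free is omitted. The second component is rotated and translated so that its port faces the port of the first, and the interaction is permitted only if no two nodes end up on the same cell.

```python
from typing import Dict, Tuple

Vec = Tuple[int, int]
# A component maps node id -> (grid cell, rotation in quarter turns).
Component = Dict[str, Tuple[Vec, int]]

# Direction of each port in the node's own local frame.
PORT_DIR: Dict[str, Vec] = {"u": (0, 1), "r": (1, 0), "d": (0, -1), "l": (-1, 0)}


def rot(v: Vec, k: int) -> Vec:
    """Rotate v counterclockwise by k quarter turns."""
    x, y = v
    for _ in range(k % 4):
        x, y = -y, x
    return (x, y)


def can_interact(A: Component, v1: str, p1: str,
                 B: Component, v2: str, p2: str) -> bool:
    """True if joining port p1 of v1 (in A) with port p2 of v2 (in B) yields a shape."""
    cell1, r1 = A[v1]
    d1 = rot(PORT_DIR[p1], r1)                    # direction of p1 in A's frame
    target = (cell1[0] + d1[0], cell1[1] + d1[1])  # cell that v2 has to occupy
    cell2, r2 = B[v2]
    d2 = rot(PORT_DIR[p2], r2)                    # direction of p2 in B's frame
    # Quarter turns to apply to B so that p2 points back towards v1.
    k = next(k for k in range(4) if rot(d2, k) == (-d1[0], -d1[1]))
    occupied = {cell for cell, _ in A.values()}
    for cell, _ in B.values():
        rel = rot((cell[0] - cell2[0], cell[1] - cell2[1]), k)
        if (target[0] + rel[0], target[1] + rel[1]) in occupied:
            return False  # a node of B would fall over a node of A
    return True


# L shape: corner, top and right node; vertical line: lower and upper node.
L_shape: Component = {"corner": ((0, 0), 0), "top": ((0, 1), 0), "right": ((1, 0), 0)}
line: Component = {"lower": ((0, 0), 0), "upper": ((0, 1), 0)}

bottom_fits = can_interact(L_shape, "top", "r", line, "lower", "l")
upper_blocked = can_interact(L_shape, "top", "r", line, "upper", "l")
upper_rotated = can_interact(L_shape, "top", "r", line, "upper", "r")
```

In this instance `bottom_fits` is true, `upper_blocked` is false because the lower node of the line would land on the right node of the L, and `upper_rotated` is true because using the opposite port of the upper node forces a half turn of the line.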
A configuration $C$ is a pair $(C_V, C_E)$, where $C_V : V \to Q$ specifies the state of each node
and $C_E : \{(v_1,p_1)(v_2,p_2) : v_1, v_2 \in V \text{ and } p_1, p_2 \in P\} \to \{0,1\}$ specifies the state (inactive or
active) of every possible port pair. Observe that if $C_E$ is the edge configuration before step $t$, then
$E(t) = C_E^{-1}[1] = \{(v_1,p_1)(v_2,p_2) : C_E((v_1,p_1)(v_2,p_2)) = 1\}$. We write $C \to C'$ if $C'$ is reachable in one step from
$C$ (meaning via a single interaction that is permitted on $C$). We say that $C'$ is reachable from $C$
and write $C \rightsquigarrow C'$, if there is a sequence of configurations $C = C_0, C_1, \ldots, C_t = C'$, such that
$C_i \to C_{i+1}$ for all $i$, $0 \leq i < t$. An execution is a finite or infinite sequence of configurations $C_0, C_1,
C_2, \ldots$, where $C_0$ is an initial configuration and $C_i \to C_{i+1}$, for all $i \geq 0$. A fairness condition is
imposed on the adversary to ensure the protocol makes progress. An infinite execution is fair if
for every pair of configurations $C$ and $C'$ such that $C \to C'$, if $C$ occurs infinitely often in the
execution then so does $C'$. In most cases, we assume that interactions are chosen by a uniform
random scheduler which, in every step $t$, selects independently and uniformly at random one of the
interactions permitted by $E(t)$. Note that the uniform random scheduler is fair with probability
1. Wherever no such probabilistic scheduling assumption is made, every execution of a protocol
is by definition considered to be fair. Random schedulers are particularly useful when one wants
to analyze the running time of protocols. In this work, we mainly exploit them in order to devise
terminating protocols that are correct with high probability (abbreviated w.h.p.), always meaning
in this work with probability at least $1 - 1/n^c$ for some constant $c \geq 1$.

We define the output of a configuration $C$ as the set of shapes $G(C) = (V_s, E_s)$ where
$V_s = \{u \in V : C_V(u) \in Q_{out}\}$ and
$E_s = C_E^{-1}[1] \cap \{(v_1,p_1)(v_2,p_2) : v_1, v_2 \in V_s \text{ and } p_1, p_2 \in P\}$. In
words, the output shapes of a configuration consist of those nodes that are in output states and
those edges between them that are active. Throughout this work, we are interested in obtaining
a single shape as the final output of the protocol (see, for an example, the black nodes and the
connections between them in Figure 7(d) on page 30). As already mentioned, our main focus will
be on terminating protocols. In this case, we assume a set $Q_{halt} \subseteq Q$ in place of $Q_{out}$. The only
difference is that for all $q_{halt} \in Q_{halt}$, every rule containing $q_{halt}$ is ineffective. In contrast, states in
$Q_{out}$ may have effective interactions which we guarantee (by design) to cease eventually, resulting
in the stabilization of the final shape.


Definition 2. We say that an execution of a protocol on n processes constructs (stably constructs) 
a shape G, if it terminates (stabilizes, resp.) with output G. 

Every 2D shape $G$ has a unique minimum 2D rectangle $R_G$ enclosing it. $R_G$ is a shape with
its nodes labeled from $\{0,1\}$. The nodes of $G$ are labeled 1, the nodes in $V(R_G) \setminus V(G)$ are labeled 0,
and all edges are active. It is like filling $G$ with additional nodes and edges to make it a rectangle.
In fact, as we shall see in Section 7, $R_G$ can also be constructed by a protocol, given $G$. The
dimensions of $R_G$ are defined by $h_G$, which is the horizontal distance between a leftmost node
and a rightmost node of the shape (x-dimension), and $v_G$, which is the vertical distance between
a highest and a lowest node of the shape (y-dimension). Let also $max\_dim_G := \max\{h_G, v_G\}$ and
$min\_dim_G := \min\{h_G, v_G\}$. Then $R_G$ can be extended by $max\_dim_G - min\_dim_G$ extra rows or
columns, depending on which of its dimensions is smaller, to a $max\_dim_G \times max\_dim_G$ square $S_G$
enclosing $G$ (we mean here a $\{0,1\}$-labeled square, as above, in which $G$ can be identified). Observe
that such a square is not unique. For example, if $G$ is a horizontal line of length $d$ (i.e. $h_G = d$
and $v_G = 1$) then it is already equal to $R_G$ and has to be extended by $d - 1$ rows to become $S_G$.
These rows can be placed in $d$ distinct ways relative to $G$, but all these squares have the same size
$max\_dim_G \times max\_dim_G$, denoted by $|S_G|$.
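The quantities just defined can be computed directly from a set of grid coordinates. The following is a minimal sketch, not part of the original model: the function names are hypothetical, and it assumes that dimensions are counted in nodes, so that a horizontal line of $d$ nodes has $h_G = d$ and $v_G = 1$ as in the example above.

```python
def enclosing_dimensions(nodes):
    """Return (h_G, v_G, max_dim_G, min_dim_G) for a set of (x, y) grid nodes.

    Assumption: dimensions are counted in nodes, so a horizontal line of
    d nodes has h_G = d and v_G = 1.
    """
    xs = [x for x, _ in nodes]
    ys = [y for _, y in nodes]
    h = max(xs) - min(xs) + 1
    v = max(ys) - min(ys) + 1
    return h, v, max(h, v), min(h, v)


def labeled_rectangle(nodes):
    """Build R_G: every cell of the bounding rectangle, labeled 1 if in G, else 0."""
    nodes = set(nodes)
    xs = [x for x, _ in nodes]
    ys = [y for _, y in nodes]
    return {
        (x, y): int((x, y) in nodes)
        for x in range(min(xs), max(xs) + 1)
        for y in range(min(ys), max(ys) + 1)
    }


def count_square_placements(nodes):
    """Number of ways to place the extra rows (or columns) that turn R_G into S_G.

    With k = max_dim_G - min_dim_G extra rows, j of them can go on one side
    and k - j on the other, for j = 0..k.
    """
    _, _, max_dim, min_dim = enclosing_dimensions(nodes)
    return max_dim - min_dim + 1
```

For a horizontal line of five nodes this gives dimensions (5, 1, 5, 1) and five possible placements of the enclosing square, in agreement with the count of $d$ placements stated in the text.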







A (j-dimensional) shape language $\mathcal{L}$ is a subset of the set of all possible (j-dimensional) shapes.
We restrict our attention here to shape languages that contain a unique shape for each possible
maximum dimension of the shape. In this case, it is equivalent, and more convenient, to translate
$\mathcal{L}$ to a language of labeled squares. In particular, we define in this work a shape language $\mathcal{L}$ by
providing for every $d \geq 1$ a single $d \times d$ square with its nodes labeled from $\{0,1\}$. Such a square
may also be defined by a $d^2$-sequence $S_d = (s_0, s_1, \ldots, s_{d^2-1})$ of bits or pixels, where $s_j \in \{0,1\}$
corresponds to the $j$-th node as follows: We assume that the pixels are indexed in a "zig-zag"
fashion, beginning from the bottom left corner, moving to the right until the bottom right corner
is encountered, then one step up, then to the left until the node above the bottom left corner is
encountered, then one step up again, then right, and so on (see the directed path in Figure 7(b) on
page 30). The shape $G_d$ defined by $S_d$, called the shape of $S_d$, is the one induced by the nodes of
the square that are labeled 1 and throughout this work we assume that $max\_dim_{G_d} = d$.
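The zig-zag indexing described above can be made concrete with a small sketch. The function names are hypothetical, and the convention that the bottom left corner has coordinates (0, 0) is an assumption made only for illustration.

```python
def zigzag_to_xy(j, d):
    """Map pixel index j (0 <= j < d*d) to grid coordinates (x, y).

    Row 0 is the bottom row and is traversed left to right; every odd row
    is traversed right to left.
    """
    y, offset = divmod(j, d)
    x = offset if y % 2 == 0 else d - 1 - offset
    return x, y


def xy_to_zigzag(x, y, d):
    """Inverse of zigzag_to_xy."""
    offset = x if y % 2 == 0 else d - 1 - x
    return y * d + offset


def shape_of(bits, d):
    """Coordinates of the pixels labeled 1 in the d^2-sequence `bits`."""
    return {zigzag_to_xy(j, d) for j, s in enumerate(bits) if s == 1}
```

For $d = 3$, indices 0, 1, 2 run along the bottom row from left to right, index 3 sits directly above index 2, and index 5 sits directly above index 0.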

For simulation purposes, we also need to introduce appropriate shape-constructing Turing Machines
(TMs). We now describe such a TM $M$: $M$'s goal is to construct a shape on the pixels of a
$\sqrt{n} \times \sqrt{n}$ square, which are indexed in the zig-zag way described above. $M$ takes as input an integer
$i \in \{0, 1, \ldots, n - 1\}$ and the size $n$ or the dimension $\sqrt{n}$ of the square (all in binary) and decides
whether pixel $i$ should belong or not to the final shape, i.e. if it should be on or off, respectively.^1
Moreover, in accordance with our definition of a shape, the construction of the TM, consisting of the
pixels that $M$ accepts (as on) and the active connections between them, should be connected (i.e.
it should be a single shape).
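As an illustration of the kind of decision procedure $M$ computes, the sketch below uses an ordinary Python predicate in place of a TM. It accepts exactly the leftmost column of the square, using the rule that pixel $i$ is on iff $i = 2k\sqrt{n} - 1$ for some $k \geq 1$ or $i = 2k\sqrt{n}$ for some $k \geq 0$. All function names are hypothetical, `d` stands for $\sqrt{n}$, and the connectivity check is only an illustrative way to confirm that the accepted pixels form a single shape.

```python
def accepts_leftmost_column(i, d):
    """Decide pixel i of a d x d zig-zag indexed square: on iff it lies in column 0.

    Even rows start at index 2k*d (k >= 0); odd rows end at index 2k*d - 1 (k >= 1).
    """
    return i % (2 * d) == 0 or (i + 1) % (2 * d) == 0


def pixel_xy(i, d):
    """Zig-zag index to (x, y), with the bottom left corner at (0, 0)."""
    y, offset = divmod(i, d)
    return (offset if y % 2 == 0 else d - 1 - offset, y)


def construct(predicate, d):
    """Run the predicate on every pixel and collect the accepted coordinates."""
    return {pixel_xy(i, d) for i in range(d * d) if predicate(i, d)}


def is_connected(cells):
    """True iff the cells form one 4-connected component on the grid."""
    cells = set(cells)
    if not cells:
        return False
    start = next(iter(cells))
    seen, stack = {start}, [start]
    while stack:
        x, y = stack.pop()
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nb in cells and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == cells
```

Note that the predicate needs `d` as an input; without it, it could not tell where a row ends.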


Definition 3. We say that a shape language $\mathcal{L} = (S_1, S_2, S_3, \ldots)$ is TM-computable or
TM-constructible in space $f(d)$, if there exists a TM $M$ (as defined above) such that, for every $d \geq 1$,
when $M$ is executed on the pixels of a $d \times d$ square it results in $S_d$ (in particular, on input $(i, d)$,
where $0 \leq i \leq d^2 - 1$, $M$ gives output $S_d[i]$), by using space $O(f(d))$ in every execution.

Definition 4. We say that a protocol $A$ constructs a shape language $\mathcal{L}$ with useful space $g(n) \leq n$,
if $g(n)$ is the greatest function for which: (i) for all $n$, every execution of $A$ on $n$ processes constructs
a shape $G \in \mathcal{L}$^2 of order at least $g(n)$ (provided that such a $G$ exists) and, additionally, (ii) for
all $G \in \mathcal{L}$ there is an execution of $A$ on $n$ processes, for some $n$ satisfying $|V(G)| \geq g(n)$, that
constructs $G$. Equivalently, we say that $A$ constructs $\mathcal{L}$ with waste $n - g(n)$.


4 Basic Constructions 

We give in this section protocols for two very basic shape construction problems, the spanning
line problem and the spanning square problem. Both constructions are very useful because they
organize the nodes in a way that is convenient for TM simulations that exploit the whole distributed
memory as a tape. Keep in mind that the protocols of this section are stabilizing (that is, eventually
the output shape stops changing) and not terminating. Our technique that allows for terminating
constructions will be introduced in Section 5.

1 If the TM is not provided with the square size together with the pixel, then it can only compute uniform/symmetric
shapes that are independent of $n$. Such a shape could for example be one that has every even pixel on and every
odd pixel off. But such shapes rarely satisfy the connectivity condition. For example, it is not clear how to activate
all the leftmost pixels of the square by a uniform TM, because such a TM should somehow guess that pixel $2\sqrt{n} - 1$
should be accepted without knowing $n$ and given that all pixels in $[1, 2\sqrt{n} - 2]$ must be rejected. So, it seems more
natural to consider TMs that apart from the pixel index are also provided with $n$ or $\sqrt{n}$ (if the latter is more
convenient) in binary. Now, it is straightforward how to resolve the acceptance of only the leftmost pixels of the
square. The TM every time accepts the input-pixel $i$ iff $i = 2k\sqrt{n} - 1$, for some $k \geq 1$, or $i = 2k\sqrt{n}$, for some
$k \geq 0$. Observe that $2k\sqrt{n}$ can always be computed because the TM is also provided with $\sqrt{n}$ in its input.

2 $G$ is the shape of a labeled square $S \in \mathcal{L}$ in case $\mathcal{L}$ is defined in terms of such squares.

4.1 Global Line 

We begin by presenting a protocol for the spanning line problem. Assume that there is initially a
unique leader in state $L_r$ (we typically use 'L' for the states of a leader to distinguish from the left
port 'l') and all other nodes are in state $q_0$. A protocol that constructs a spanning line is described
by the effective rules $(L_i, i), (q_0, j), 0 \to (q_1, L_{\bar{j}}, 1)$ for all $i, j \in \{u, r, d, l\}$, where $\bar{j}$ denotes the port
opposite to port $j$. In words, initially the leader waits to meet a $q_0$ via its right port. Assume that
it meets port $j$ of a $q_0$. Then the connection between them becomes activated and the leader takes
the place of the $q_0$, leaving behind a $q_1$. Moreover, the new leader is now in state $L_{\bar{j}}$, indicating that
it is now waiting to expand the line towards the port that is opposite to the one that is already
active, which guarantees that a straight line will be formed. We could even have a simplified version
of the form $(L, r), (q_0, l), 0 \to (q_1, L, 1)$. This is a little slower, because now an effective interaction,
and a resulting expansion of the line, only occurs when the $r$ port of the leader meets the $l$ port of
a $q_0$.


4.2 $\sqrt{n} \times \sqrt{n}$ Square

We now give two protocols for the spanning square problem. We assume for simplicity that the
square root of $n$ is an integer. We again begin from the case where there is a preelected unique leader
in state $L_u$ and all other nodes are initially in state $q_0$.


Protocol 1 Square

$Q = \{L_u, L_r, L_d, L_l, q_0, q_1\}$

$\delta$:

$(L_u, u), (q_0, d), 0 \to (q_1, L_r, 1)$

$(L_r, r), (q_0, l), 0 \to (q_1, L_d, 1)$

$(L_d, d), (q_0, u), 0 \to (q_1, L_l, 1)$

$(L_l, l), (q_0, r), 0 \to (q_1, L_u, 1)$

$(L_u, u), (q_1, d), 0 \to (L_l, q_1, 1)$

$(L_r, r), (q_1, l), 0 \to (L_u, q_1, 1)$

$(L_d, d), (q_1, u), 0 \to (L_r, q_1, 1)$

$(L_l, l), (q_1, r), 0 \to (L_d, q_1, 1)$

// All transitions that do not appear have no effect


We now describe the idea that Protocol 1 implements. The protocol first constructs a $2 \times 2$
square. When it is done, the leader is at the bottom-right corner and is in state $L_d$. This can only
cause the attachment of a free $q_0$ from below. When this occurs, the leader moves on the new node
and tries to move to the left. This will occur by the attachment of another free node from the left
this time. When this occurs, the leader moves on the new node and tries to move up. But this
time the up movement cannot succeed because the leader is below the bottom-left corner of the
square. Instead the leader activates the connection with that corner and tries to move another step
left. When it succeeds, it tries again to move up, which now can occur because it has now moved
outside the left boundary of the $2 \times 2$ square. In general, whenever the leader is at the left (the
up, right, and down cases are symmetric) of the already constructed square it tries to move to the
right in order to walk above the square. If it does not succeed, it is because it has not yet moved
over the upper boundary, so it activates the edge to the right, takes another step up and then tries
again to move to the right. In this way, the leader always grows the square perimetrically and in
the clockwise direction.
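The perimetric growth described above can be traced with a small grid simulation of the rules of Protocol 1. This is only a sketch under simplifying assumptions that are not part of the protocol itself: the scheduler is abstracted away (a free $q_0$ is assumed to arrive whenever the leader waits for one), positions are tracked explicitly on an integer grid, and the names `grow_square`, `TURN_AFTER_ATTACH` and `TURN_AFTER_BLOCK` are hypothetical.

```python
# Leader state after attaching a free q0 through port i: (L_i, i),(q0, .),0 -> (q1, L_next, 1)
TURN_AFTER_ATTACH = {"u": "r", "r": "d", "d": "l", "l": "u"}
# Leader state after meeting an existing q1 through port i: (L_i, i),(q1, .),0 -> (L_next, q1, 1)
TURN_AFTER_BLOCK = {"u": "l", "r": "u", "d": "r", "l": "d"}
STEP = {"u": (0, 1), "r": (1, 0), "d": (0, -1), "l": (-1, 0)}


def grow_square(n):
    """Place n nodes following Protocol 1; return (occupied cells, active edges).

    Assumption: every attachment the leader waits for eventually happens, so
    only the order of effective interactions is simulated. The loop stops as
    soon as n nodes are placed, so some edges may still be inactive.
    """
    pos, state = (0, 0), "u"          # the leader starts in state L_u
    occupied, edges = {pos}, set()
    while len(occupied) < n:
        dx, dy = STEP[state]
        target = (pos[0] + dx, pos[1] + dy)
        edges.add(frozenset((pos, target)))   # the connection becomes active
        if target in occupied:
            # Blocked by a q1: activate the edge, stay in place, change direction.
            state = TURN_AFTER_BLOCK[state]
        else:
            # A free q0 attaches: the leader moves onto it, leaving a q1 behind.
            occupied.add(target)
            pos = target
            state = TURN_AFTER_ATTACH[state]
    return occupied, edges
```

Calling `grow_square(d * d)` yields a set of cells whose bounding box is $d \times d$, which matches the claim that the square grows one perimeter layer at a time.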

We next use turning marks to simplify and speed up the turning process. The unique leader 
begins in state L% Now, instead of always trying to turn, the leader turns only when it meets 
special marks left by the previous phase near the corners of the square. When it meets such a 
mark, the leader introduces the new corner and a new mark adjacent to that corner to be found 
during the next phase, and then makes a turn (see Figure 2). A difference from the previous protocol
is that now several of the nodes of the new perimeter may remain disconnected for a while from
their internal neighbors (i.e. those belonging to the internal perimeter constructed in the previous
phase). However, rules of the form $(q_1, i), (q_1, \bar{i}), 0 \to (q_1, q_1, 1)$ guarantee that these nodes eventually
become connected. A disadvantage of this approach is that the structure may be less "rigid" than
the previous one as long as several $(q_1, q_1)$ connections are not yet established. The protocol is
formally presented in Protocol 2.


[Figure 2: two panels, "1st Phase" and "2nd Phase".]

Fig. 2. The first two phases of Protocol 2. Gray nodes indicate the starting point of each phase. Edge labels indicate
the order by which the square grew during the phase. The nodes labeled $L_{end}$ are the points at which each of the
phases ends. The unlabeled solid edges of Phase 2 indicate the shape that preexisted from Phase 1. The nodes
attached at "times" 1, 3, 5, 7 of Phase 1 and 11, 15, 19, 23 of Phase 2 are the turning marks that will be exploited for
easier turning by the leader in the subsequent phase. Dotted edges are edges that have not been activated yet but will
certainly be activated eventually, resulting in a more "rigid" structure.


In all the above cases, the unique leader assumption is not necessary.

Protocol 2 Square2

Q = {Li, L 2 i, Li, Li, L end , qo, qi}, for all i £ {u, r, d, 1} 

S: 

(L d , d), (qo, u), 0 y (L u , q\, 1) 

(L?,l), {q o ,r),0-> (Ll,qi, 1) 

( L u , u ), ( qo, d), 0 y ( L d , q i , 1) 

(Ll,r),(q o ,l),0-> (L end ,qi,l) 

{Liu), (q o ,d),0-y (qi,L 2 , 1) 

(Ll,r),(q 0 ,l),0-> {qi,L 2 u ,l) 

{Ld, d), {qo, u), 0 y (gi , L r , 1) 

{Ll,u), (q o ,d),0-y (qi,L?, 1) 

(Lend, d), (q 0 , u), 0 -*■ (qi,Li, 1) 

(L h l), (go,r), 0 —y (qi,L t , 1) 

(L u l), (gi,r),0 —y (qi,L?, 1) 

( L u ,u ), (g 0 , d), 0 -y (qi,L u , 1) 

(Lu,u), (qi,d),0^y (qi,L 3 u , 1) 

(L r ,r), (qo, l), 0 —y (qi,L r ,l) 

(L r ,r), (gi, l), 0 —y (gi,L* 1) 

(L d , d), (q 0 , u), 0 —y (qi,L d , 1) 

(L d , d), (gi, m), 0 —y (qi,L 3 d ,l) 

(L 3 , l), (qo, r), 0 —» (gi, L d , 1) 

(L 3 u ,u), (q 0 , d), 0 —y (qi,Lf, 1) 

(L 3 ,r),(q o ,l),0 -y (gi, 1/^,1) 

(■ L d ,d ), (go, u), 0 -y (qi,L*, 1) 

(L d , d), (qo, u ), 0 y (L u ,qi, 1) 

(Li,l), (go, r), 0 -y (L r ,qi, 1) 

(Lt,u), (qo, d), 0 -y (L d , gi, 1) 

(Lt,r), (qo, l), 0 (L e „d,gi,l) 

$(q_1, i), (q_1, \bar{i}), 0 \to (q_1, q_1, 1)$ for all $i \in \{u, r, d, l\}$, where $\bar{i}$ denotes the opposite port of $i$

$(L_u, r), (q_1, l), 0 \to (L_u, q_1, 1)$

$(L_r, d), (q_1, u), 0 \to (L_r, q_1, 1)$

$(L_d, l), (q_1, r), 0 \to (L_d, q_1, 1)$

$(L_l, u), (q_1, d), 0 \to (L_l, q_1, 1)$


5 Probabilistic Counting


In this section, we consider the problem of counting $n$. In particular, we assume a uniform random
scheduler and we want to give protocols that always terminate but still w.h.p. count $n$ correctly. The
importance of such protocols is further supported by the fact that we cannot guarantee anything
much better than this. In particular, observe that if we require a population protocol to always
terminate and additionally to always be correct, then we immediately obtain an impossibility result.
It is easy to see this by imagining a system in which a unique leader interacts with the other nodes
(there are no interactions between non-leaders and no connections are ever activated). Any fair
execution $s_1$ of a protocol in a population of size $n$ in which the leader outputs $n$ and terminates
can appear as an "unfair" prefix of a fair execution $s' = s_1 s_2$ on a population of size $n' > n$. This is
a contradiction because in $s'$ the leader must again terminate and output $n$ even though $n' \neq n$. The
main reason is that $|s_1|$ is finite and independent of $n$; it only depends on the maximum "depth" of
a chain of rules of the protocol leading to termination. This implies that in $s'$ the leader terminates
before interacting with all other nodes.

In Section 5.1 we present a population protocol with a unique leader that solves w.h.p. the
counting problem and always terminates. To the best of our knowledge, this is the first protocol of
this sort in the relevant literature. All probabilistic protocols that have appeared so far, like e.g.
those in [AAD+06, AAE08], are not terminating but stabilizing and the high probability arguments
concern their time to convergence. Additionally, this protocol is crucial because all of our generic
constructors, that are developed in Section 6, are terminating by assuming knowledge of $n$ (stored
distributedly on a line of length $\log n$). They obtain access to this knowledge w.h.p. by executing
the counting protocol as a subroutine. Finally, knowing $n$ w.h.p. allows us to develop protocols that
exploit sequential composition of (terminating) subroutines, which makes them much more natural
and easy to describe than the protocols in which all subroutines are executed in parallel and
perpetual reinitialization is the only means of guaranteeing eventual correctness (the latter is the
case e.g. in [GR09, MCS11a, MS14], but not in [MCS12] which was the first extension to allow
for sequential composition based on some non-probabilistic assumptions). Then, in Section 5.2, we
comment on the possibility of dropping the unique leader assumption. In particular, we conjecture
that in general it is impossible to solve the problem if all nodes are identical and we present some
evidence supporting this. Finally, in Section 5.3 we establish that if the nodes have unique ids then
it is possible to solve the problem without a unique leader.


5.1 Fast Probabilistic Counting With a Leader 

Keep in mind that in order to simplify the discussion, a sort of population protocol is presented here. 
So, there are no ports, no geometry, and no activations/deactivations of connections. In every step, 
a uniform random scheduler selects equiprobably one of the $n(n-1)/2$ possible node pairs, and the
selected nodes interact and update their states according to the transition function. The only difference
from the classical population protocols is that a distinguished leader node has unbounded local
memory (of the order of $n$). In Section 6.1 we will adjust the protocol to make it work in our model.


Counting-Upper-Bound Protocol: There is initially a unique leader $l$ and all other nodes are
in state $q_0$. Assume that $l$ has two $n$-counters in its memory, initially both set to 0. So, the state
of $l$ is denoted as $l(r_0, r_1)$, where $r_0$ is the value of the first counter (call the corresponding counter
$R_0$) and $r_1$ the value of the second counter (call it $R_1$), $0 \leq r_0, r_1 \leq n$. The rules of the protocol
are

$(l(r_0, r_1), q_0) \to (l(r_0 + 1, r_1), q_1)$,

$(l(r_0, r_1), q_1) \to (l(r_0, r_1 + 1), q_2)$, and

$(l(r_0, r_1), \cdot) \to (halt, \cdot)$ if $r_0 = r_1$.

It is worth reminding that, for the time being, we have disregarded edge-states and, therefore, the
rules of the protocol only specify how the states of the nodes are updated. Observe that $r_0$ counts
the number of $q_0$s in the population while $r_1$ counts the number of $q_1$s. Initially, there are $n - 1$ $q_0$s
and no $q_1$s. Whenever $l$ interacts with a $q_0$, $r_0$ increases by 1 and the $q_0$ is converted to $q_1$. Whenever
$l$ interacts with a $q_1$, $r_1$ increases by 1 and the $q_1$ is converted to $q_2$. The process terminates when
$r_0 = r_1$ for the first time. We also give to $r_0$ an initial head start of $b$, where $b$ can be any desired
constant. So, initially we have $r_0 = b$, $r_1 = 0$ and $i = \#q_0 = n - b - 1$, $j = \#q_1 = b$ (this can
be easily implemented in the protocol by having the leader convert $b$ $q_0$s to $q_1$s as a preprocessing
step). So, we have two competing processes, one counting $q_0$s and the other counting $q_1$s, the first
one begins with an initial head start of $b$ and the game ends when the second catches up with the first.
We now prove that when this occurs the leader will almost surely have already counted at least
half of the nodes.
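The race between the two counters can be explored with a short simulation. This is a sketch only: it simulates just the effective interactions (which is an assumption about how to model the process, not part of the protocol), it takes the head start $b$ as already applied, and the function name `counting_upper_bound` is hypothetical.

```python
import random


def counting_upper_bound(n, b, rng):
    """Simulate the effective interactions of the protocol; return (r0, r1) at halting.

    n   : population size (including the leader)
    b   : head start of counter R0, with 1 <= b <= n - 1
    rng : a random.Random instance, playing the role of the uniform scheduler
    """
    if not 1 <= b <= n - 1:
        raise ValueError("head start b must satisfy 1 <= b <= n - 1")
    r0, r1 = b, 0
    i, j = n - 1 - b, b          # i = number of q0s, j = number of q1s
    while r0 != r1:              # the leader halts as soon as r0 = r1
        # Given an effective interaction, it is (l, q0) with probability i / (i + j).
        if rng.random() < i / (i + j):
            i, j, r0 = i - 1, j + 1, r0 + 1   # a q0 becomes q1
        else:
            j, r1 = j - 1, r1 + 1             # a q1 becomes q2
    return r0, r1
```

A typical use is to run `counting_upper_bound(n, b, random.Random(seed))` for many seeds and record how often the returned `r0` falls below `n / 2`; the theorem that follows bounds the probability of that event.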

Theorem 1. The above protocol halts in every execution. Moreover, when this occurs, w.h.p. it
holds that $r_0 \geq n/2$.

Proof. Recall that the scheduler is a uniform random one, which, in every step, selects independently
and uniformly at random one of the $n(n-1)/2$ possible interactions. Recall also that the random
variable $i$ denotes the number of $q_0$s and $j$ denotes the number of $q_1$s in the configuration, where
initially $i = n - b - 1$ and $j = b$. Observe also that all the following hold: $j = r_0 - r_1$; $r_0 \geq r_1$,
because every conversion of a $q_1$ to $q_2$ must have been first counted by $R_0$ as a conversion of a $q_0$
to $q_1$; $r_1 = (n - 1) - (i + j)$; and $r_0 + r_1$ is equal to the number of effective interactions (see Figure 3).

We focus only on the effective interactions (we also disregard the halting interaction), which
are always interactions between $l$ and $q_0$ or $q_1$. Given that we have an effective interaction, the
probability that it is an $(l, q_0)$ is $p_{ij} = i/(i + j)$ and the probability that it is an $(l, q_1)$ is
$q_{ij} = 1 - p_{ij} = j/(i + j)$. This random process may be viewed as a random walk (r.w.) on a line with $n + 1$
positions $0, 1, \ldots, n$ where a particle begins from position $b$ and there is an absorbing barrier at 0
and a reflecting barrier at $n$. The position of the particle corresponds to the difference $r_0 - r_1$ of
the two counters, which is equal to $j$. Observe now that if $j \geq n/2$ then $r_0 - r_1 \geq n/2 \Rightarrow r_0 \geq n/2$,
so it suffices to consider a second absorbing barrier at $n/2$. The particle moves forward (i.e. to the
right) with probability $p_{ij}$ and backward with probability $q_{ij}$ (see Figure 4). This is a "difficult"
random walk because the transition probabilities not only depend on the position $j$ but also on the
sum $i + j$, which decreases in time. In particular, the sum decreases whenever an $(l, q_1)$ interaction
occurs, in which case a $q_1$ becomes $q_2$. That is, whenever the random walk returns to some position
$j$ of the line, its transition probabilities have changed (because every leaving and returning involves
at least one step to the left, which decreases the sum). Observe also that, in our case, the duration of
the random walk can be at most $n - b$, in the sense that if the particle has not been absorbed after
$n - b$ steps then we have success. The reason for this is that $n - b$ effective interactions imply that
$r_0 + r_1 = n$, but as $r_0 \geq r_1$, we have $r_0 \geq n/2$. In fact, $r_0 \geq n/2 \Leftrightarrow j + r_1 \geq n/2$. We are interested
in upper bounding P[failure] = P[reach 0 before $r_0 \geq n/2$ holds], which is in turn upper bounded
by the probability of reaching 0 before reaching $n/2$ and before $n - b$ effective interactions have
occurred (this is true because, in the latter event, we have disregarded some winning conditions
like, for example, guaranteed winning in $(n/2) + r_1$ effective interactions, in which case we have
winning in only $(n/2) + r_1$ effective interactions and $j$ having become at most $(n/2) - r_1$). It suffices
to bound the probability of reaching 0 before $n$ effective interactions have occurred.

Fig. 3. A configuration of the system (excluding the leader). The number of $q_0$s remaining is denoted by $i$. The number
of $q_1$s introduced so far is denoted by $j$. The value of the counter $R_1$ is equal to the number of $q_1$s encountered so far
by the leader, which is in turn equal to the number of $q_2$s introduced, and is denoted by $r_1$. The value of the counter
$R_0$ is equal to the number of $q_0$s encountered, which is equal to the number of $q_1$s and $q_2$s introduced and is denoted
by $r_0$.


[Figure 4: a line of positions with 0, $b$ and $n/2$ marked; backward moves are labeled $q_{ij} = 1 - p_{ij}$ and forward moves $p_{ij}$.]

Fig. 4. A random walk modeling of the probabilistic process that the Counting-Upper-Bound protocol implements.
A particle begins from position $b$. The position $j$ of the particle corresponds to the difference between $r_0$ and $r_1$.
Forward movement corresponds to an increment of $r_0$ and backward movement corresponds to an increment of $r_1$.
Absorption at 0 corresponds to $r_1$ becoming equal to $r_0$ and thus to termination (and to failure if this occurs before
$r_0 \geq n/2$ holds). Absorption at $n/2$ corresponds to $r_0$ becoming at least $n/2$ (before being absorbed at 0) and thus
to success.


Thus, we have $r_0 + r_1 \leq n$ but $r_1 \leq r_0 \Rightarrow 2r_1 \leq r_0 + r_1$, thus $2r_1 \leq n \Rightarrow r_1 \leq n/2 \Rightarrow
(n - 1) - (i + j) \leq n/2 \Rightarrow i + j \geq (n/2) - 1$. Setting $n' = (n/2) - 1$, we have $i + j \geq n'$.
Moreover, observe that when $r_0 + r_1 = n + 1$ we have $n + 1 = r_0 + r_1 \leq 2r_0 \Rightarrow r_0 > n/2$. In
summary, during the first $n$ effective interactions, it holds that $i + j \geq n' = (n/2) - 1$ and when
interaction $n + 1$ occurs it holds that $r_0 > n/2$, that is, if the process is still alive after time $n$, then
$r_0$ has managed to count up to $n/2$ and the protocol has succeeded.

Now, $i + j \geq n'$ implies that $p_j \geq (n' - j)/n'$ and $q_j \leq j/n'$ so that now the probabilities only
depend on the position $j$. This new walk is the well-studied Ehrenfest random walk coming from
the theory of Brownian motion. Imagine gas molecules that move about randomly in a container
which is divided into two halves symmetrically by a partition. A hole is made in the partition
to allow the exchange of molecules between the subcontainers. Suppose there are $n$ molecules in
the container. Think of the partitions as two urns, I and II, containing balls labeled 1 through $n$.
Molecular motion can be modeled by choosing a number between 1 and $n$ at random and moving
the corresponding ball from the urn it is presently in to the other. This is a historically important
physical model, known as the Ehrenfest model of diffusion, introduced in [EEA07] in the early days
of statistical mechanics to study thermodynamic equilibrium. So, the probability of failure of our
counting protocol is asymptotically equivalent to the probability that urn I becomes empty in the
first $n$ steps assuming that it initially contains $b$ balls. This walk has been studied by Kac in [Kac47]
who, among other things, proved that the mean recurrence time is $\frac{(R + k)!\,(R - k)!}{(2R)!}\, 2^{2R}$
([Kac47], page 386). If we set $k = -R$ so that the initial position is $R + k = 0$, then this evaluates
to $2^{2R} = 2^{n/2}$, because $2R$ is the total length of the line. This shows that, even if we begin from
position 0 instead of $b$, the recurrence time is expected to be huge and we do not expect the walk
to return to 0 and fail in only $n$ effective steps. In the sequel, we turn this into the desired high
probability argument.
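To get a feeling for how quickly the mean recurrence time grows, the expression quoted from Kac can be evaluated exactly. The sketch below is only an arithmetic aid; the function name is hypothetical and the formula is taken verbatim from the text.

```python
from fractions import Fraction
from math import factorial


def mean_recurrence_time(R, k):
    """Evaluate ((R + k)! (R - k)! / (2R)!) * 2^(2R) exactly, for -R <= k <= R."""
    if abs(k) > R:
        raise ValueError("k must satisfy -R <= k <= R")
    ratio = Fraction(factorial(R + k) * factorial(R - k), factorial(2 * R))
    return ratio * 2 ** (2 * R)
```

With $k = -R$ the factorial ratio equals 1, so the function returns exactly $2^{2R}$, the value used in the argument above.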

We will reduce the Ehrenfest walk to one in which the probabilities do not depend on $j$. We first
further restrict our walk, this time to the prefix $[0, b]$ of the line. In this part, it holds that $j \leq b$
which implies that $p \geq (n' - b)/n'$ and $q \leq b/n'$. Now we set $p = (n' - b)/n'$ and $q = b/n'$. Observe
that this may only increase the probability of failure, so the probability of failure of the new walk is
an upper bound on the probability of failure of our original walk. Recall that initially the particle
is on position $b$. Imagine now an absorbing barrier at 0 and another one at $b$. Whenever the r.w. is
on $b - 1$ it will either return to $b$ before reaching 0 or it will reach 0 (and fail) before returning to
$b$. So, we now have a r.w. with $b + 1$ positions, where positions 0 and $b$ are absorbing and due to
symmetry it is equivalent to assume that the particle begins from position 1, moves forward with
probability $p' = q$, backward with probability $q' = p$, and it fails at $b$. Thus, it is equivalent to
bound P[reach $b$ before 0 (when beginning from position 1)]. This is the probability of winning in
the classical ruin problem analyzed e.g. in [Fel68], page 345. If we set $x = q'/p' = p/q = (n' - b)/b$
we have that:

\[
\mathrm{P}[\text{reach } b \text{ before } 0] = 1 - \frac{x^b - x}{x^b - 1} = \frac{x - 1}{x^b - 1}
< \frac{x}{x^b - 1} \approx \frac{1}{x^{b-1}} \approx \frac{1}{n^{b-1}}.
\]

Thus, whenever the original walk is on $b - 1$, the probability of reaching 0 before reaching $b$
again is at most $1/n^{b-1}$. Now assume that we repeat the above walk $n$ times, i.e. we place the
particle on $b - 1$, play the game, then if it returns to $b$ we put again the particle on $b - 1$ and play
the game again, and so on. From the Boole-Bonferroni inequality, we have that:

\[
\mathrm{P}[\text{fail at least once}] \leq \sum_{m=1}^{n} \mathrm{P}[\text{fail at repetition } m]
\leq n \cdot \frac{1}{n^{b-1}} = \frac{1}{n^{b-2}}.
\]

In summary, even if the protocol was restricted to disregard counter differences that are greater
than $b$, still with probability at least $1 - 1/n^c$ (for constant $c = b - 2$) the protocol has not terminated
after at least $n$ effective interactions, which in turn implies that the leader has counted at least half
of the nodes. □
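The ruin probability used in the proof can be evaluated exactly for concrete values of $n$ and $b$. The sketch below uses the standard closed form for a walk on $\{0, \ldots, b\}$ with absorbing ends; the function names are hypothetical, and `failure_bound` simply multiplies the single-round probability by $n$, mirroring the union bound, with $n' = (n/2) - 1$ as in the proof.

```python
from fractions import Fraction


def reach_b_before_0(p_forward, b, start=1):
    """Probability that a walk on {0, ..., b} started at `start` hits b before 0.

    p_forward is the probability of a step to the right (pass a Fraction for
    exact results); with x = (1 - p_forward) / p_forward the closed form is
    (x^start - 1) / (x^b - 1).
    """
    p = Fraction(p_forward)
    q = 1 - p
    if p == q:
        return Fraction(start, b)
    x = q / p
    return (x ** start - 1) / (x ** b - 1)


def failure_bound(n, b):
    """Union bound over n repetitions of the restricted walk of the proof."""
    n_prime = Fraction(n, 2) - 1
    p_forward = Fraction(b) / n_prime    # p' = q = b / n'
    return n * reach_b_before_0(p_forward, b)
```

For example, `failure_bound(1000, 4)` is the bound obtained for a population of 1000 nodes with head start 4; increasing `b` makes the bound shrink by roughly a factor of $n$ per unit.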

Remark 1. For the Counting-Upper-Bound protocol to terminate, it suffices for the leader to meet
every other node twice. This takes twice the expected time of a meet everybody (cf. [MS14]), thus
the expected running time of Counting-Upper-Bound is $O(n^2 \log n)$ (interactions).
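The order of growth in Remark 1 can be checked with a coupon-collector style calculation. The sketch below computes the exact expected number of interactions until the leader has met every other node once under the uniform random scheduler; the function name is hypothetical, and doubling the result to account for meeting everyone twice follows the remark rather than a separate derivation.

```python
from fractions import Fraction


def expected_meet_everybody(n):
    """Expected interactions until a fixed leader has met all n - 1 other nodes.

    Each step picks one of the n(n-1)/2 pairs uniformly. While `remaining`
    nodes are still unmet, a step is useful with probability
    remaining / (n(n-1)/2), so the expected wait is (n(n-1)/2) / remaining.
    """
    pairs = Fraction(n * (n - 1), 2)
    return sum((pairs / remaining for remaining in range(1, n)), Fraction(0))
```

The sum equals $\binom{n}{2} H_{n-1}$, where $H_{n-1}$ is the harmonic number, which grows as $n^2 \log n$ up to a constant factor. The final term of the sum, $n(n-1)/2$, is the expected wait for the last unmet node.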

Remark 2. When the Counting-Upper-Bound protocol terminates, w.h.p. the leader knows an $r_0$
which is between $n/2$ and $n$. So any subsequent routine can use directly this estimation and pay in
an a priori waste which is at most half of the population. In practice, this estimation is expected
to be much closer to $n$ than to $n/2$ (in all of our experiments for up to 1000 nodes, the estimation
was always close to $(9/10)n$ and usually higher). On the other hand, if we want to determine
the exact value of $n$ and have no a priori waste then we can have the leader wait an additional
large polynomial (in $r_0$) number of steps, to ensure that the leader has met every other node w.h.p.
(observe e.g. that the last unvisited node requires an expected number of $\Theta(n^2)$ steps to be visited).

5.2 Impossibility of Counting Without a Leader 

An immediate question is whether the unique leader assumption of Theorem 1 can be dropped.
Unfortunately, the answer to this question seems to be negative. In particular, it seems that any 
protocol in which all nodes begin from the same state may have some node terminate with (at least) 
constant probability having participated in only a constant number of interactions. This implies 
that with constant probability the protocol terminates without having estimated any non-constant 
function of n. 

Nodes again have a set of states $Q$ and we also assume that they have unbounded private local
memories. These memories are for internal purposes only and their contents are not communicated
to other nodes. For example, a node $u$ could maintain $|Q|$ counters, each counting the number of
times the corresponding state has been encountered so far by $u$. We focus on protocols that always
terminate (i.e. for every $n \geq n_0$, for some finite $n_0$) and we want them to compute something w.h.p.,
e.g. the node that first terminates to know an upper bound on $n$ w.h.p.

Conjecture 1. Let A be a protocol as above. Then, as n grows, there is (at least) a constant 
probability that some node terminates having interacted only a constant number of times. 

We now give some evidence why the above conjecture, that excludes the possibility of protocols
that count $n$ w.h.p., seems to be true. First of all, observe that a protocol apart from the usual
transition function $\delta : Q \times Q \to Q \times Q$ that updates the communicating states has also a function
$\gamma : Q \times S \to S$ that updates the internal memory based on the encountered states. We focus on
deterministic $\gamma$ and in this case the internal state from $S$ after $k$ interactions only depends on
the observed sequence in $Q^k$ of encountered states (because the initial state $q_0$ is always the same
for all nodes). Every protocol $A$ that always terminates essentially defines a property $L_A \subseteq Q^*$
consisting of those observed sequences that make a node terminate (the remaining sequences do
not cause termination). Moreover, as the protocol does not know $n$, an $s_0 \in L_A$ of minimum length
has length that is independent of $n$ (it could only be a function of $|Q|$). Observe that for every
population size $n$, if $s_0$ is observed by some node $u$ as a prefix of its interaction pattern (i.e. in its
first $|s_0|$ interactions) then $u$ terminates while having participated in only $|s_0|$ interactions, which
is a constant number independent of $n$. What seems to hold is that, for every $n \geq n_0$ and every
such fixed $s_0$, there is (at least) a constant probability that some node observes $s_0$. In particular,
we believe that it might be possible to prove the following set of arguments provided that $n \geq n_0$:

1. With constant probability a configuration is reached in which every state q ∈ Q has multiplicity
Θ(n) (that is, it appears on Θ(n) distinct nodes).

2. With constant probability the multiplicities of all states remain Θ(n) for Θ(n) steps.

3. While (2) holds, with constant probability one of the Θ(n) nodes whose state is q_0, let it be u,
interacts |s_0| times.

If the above hold, then it follows that u may observe s_0 with constant probability, in which case
u will terminate having interacted only a constant (i.e. |s_0|) number of times. The reason for this
is that in its ith interaction, for all 1 ≤ i ≤ |s_0|, u observes the ith state of s_0, let it be q_i, with
probability (#q_i in the population)/Θ(n). As, by (2), the numerator is also Θ(n), for all q_i ∈ Q,
the resulting probability is constant. Unfortunately, we have not yet been able to turn this into a
formal proof.
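
The last step can be written out as a short calculation. This is only a sketch under two extra assumptions that are not established above: a single constant c > 0 such that every state has multiplicity at least cn (not counting u itself) throughout the first |s_0| interactions of u, and a partner of u that is uniform over the other n − 1 nodes in each of these interactions.

```latex
\Pr[\,u \text{ observes } s_0\,]
  \;\ge\; \prod_{i=1}^{|s_0|} \frac{cn}{n-1}
  \;\ge\; c^{\,|s_0|}
```

Since both c and |s_0| are independent of n, the bound is a constant, which is what the conjecture requires.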

5.3 Counting Without a Leader but With UIDs 

Now nodes have unique ids from a universe U. Nodes initially do not know the ids of other nodes
nor n. The goal is again to count n w.h.p. All nodes execute the same program and no node can
initially act as a unique leader, because nodes do not know which ids from U are actually present in
the system. Nodes have unbounded memory but we try to minimize it, e.g. if possible store only
up to a constant number of other nodes' ids. We show that under these assumptions, the counting
problem can be solved without the necessity of a unique leader.

5.3.1 A Simple Protocol We first show feasibility by a very simple protocol which guarantees
that the nodes w.h.p. count n, at the cost of a large termination time.

Protocol: Every node u remembers its first b interactions, where b is a predetermined constant.
In particular, it maintains a vector v_u of length b and in every interaction i, 1 ≤ i ≤ b, with a
node v it sets v_u(i) ← id_v. Also u counts the number of distinct nodes that it has interacted
with so far, by placing their ids in a list A_u. Initially A_u ← {v_u} ∪ {id_u} and in every interaction
with a node v, u sets A_u ← A_u ∪ {id_v}. Moreover, after interaction b, u keeps track of the ids
encountered in every b consecutive interactions, in another vector v'_u of length b, initially empty.
Whenever a sequence of length b is recorded (i.e. the vector is full), if v_u = v'_u then u outputs |A_u|
and terminates, otherwise u clears the contents of v'_u and starts recording the next b interactions.
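
The protocol can be tried out with a small simulation. The sketch below is illustrative only: it assumes a uniform random scheduler that picks one unordered pair per step, uses the integers 0, ..., n − 1 as ids, and starts A_u from {id_u} (the vector v_u being empty at that point). The function name and the max_steps cut-off are hypothetical and not part of the protocol.

```python
import random

def simulate_simple_protocol(n, b, seed=0, max_steps=10**6):
    """Run the simple UID counting protocol under a uniform random scheduler.

    Returns (id of the node that terminated, its output |A_u|), or None if
    no node terminated within max_steps interactions.
    """
    rng = random.Random(seed)
    first = [[] for _ in range(n)]    # v_u: ids met in the first b interactions
    window = [[] for _ in range(n)]   # v'_u: ids met in the current block of b
    seen = [{u} for u in range(n)]    # A_u, started from {id_u} (assumption)
    for _ in range(max_steps):
        u, v = rng.sample(range(n), 2)        # one pairwise interaction
        for a, c in ((u, v), (v, u)):         # both endpoints apply the rule
            seen[a].add(c)
            if len(first[a]) < b:             # still recording v_u
                first[a].append(c)
                continue
            window[a].append(c)
            if len(window[a]) == b:           # a full block has been recorded
                if window[a] == first[a]:
                    return a, len(seen[a])    # output |A_u| and terminate
                window[a].clear()
    return None
```

For small n and b, for instance `simulate_simple_protocol(5, 2)`, a run finishes quickly. Since the expected termination time grows like n^b, larger values of b make termination rare within any fixed step budget.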

Theorem 2. When a node u terminates in the above protocol, w.h.p. |A_u| = n. The expected
termination time is b(n − 1)^b = Θ(n^b).

Proof. Given the initial sequence of length b, i.e. v_u, the probability that a sequence of b consecutive
interactions observed by u is equal to v_u is 1/(n − 1)^b and thus the expected time for this to occur
is b(n − 1)^b interactions of u. As u participates on average every n steps, this is a total of bn(n − 1)^b
interactions. But there are n nodes doing the same independently of one another, thus the actual
expected time for one of the nodes to terminate is b(n − 1)^b = Θ(n^b). On the other hand, the
expected time for any of the nodes to meet every other node is only Θ(n log n). □

5.3.2 An Improved Protocol We now give a protocol that improves the expected time to
termination while still guaranteeing correct counting w.h.p. The idea is to have the node with the maximum
id in the system perform the same process as the unique leader in the protocol with no ids of
Theorem 1. Of course, initially all nodes have to behave as if they were the maximum (as they do
not know in advance who the maximum is). By comparing ids during an interaction and deactivating
the smaller one, we can easily guarantee that eventually only the maximum, u_max, will remain
active. Moreover, it is clear that u_max always wins against the other nodes in every interaction, so we
easily ensure that its process is not affected by the other nodes. However, we must also guarantee
that no other node ever terminates early (with sufficiently large probability), giving as output a
wrong count.

Informal description: Every node u has a unique id id_u and tries to simulate the behavior
of the unique leader of the protocol of Theorem 1. In particular, whenever it meets another node
for the first time it wants to mark it once and the second time it meets that node it wants to mark
it twice, recording the number of first-meetings and second-meetings in two local counters. The
problem is that now many nodes may want to mark the same node. One idea, of course, could be to
have a node remember all the nodes that have marked it so far but we want to avoid this because it
requires a lot of memory and communication. Instead, we allow a node to only remember a single
other node's id at a time. Every node tries initially to increase its first-meetings counter to b so
that it creates an initial b head start of this counter w.r.t. the other. Every node that succeeds
starts executing its main process. The main idea is that whenever a node u interacts with another
node that either has or has been marked by an id greater than id_u, u becomes deactivated and
stops counting. This guarantees that only u_max will forever remain active. Moreover, every node
u always remembers the maximum id that has marked it so far, so that the probabilistic counting
process of a node u can only be affected by nodes with id greater than id_u and as a result no one
can affect the counting process of u_max. Protocol 3 puts all these together formally and Theorem
3 shows that this process correctly simulates the counting process of Theorem 1, thus providing
w.h.p. an upper bound on n.

Theorem 3. When a node u in Protocol 3 halts, w.h.p. it holds that u = u_max and that
2 · count1_u ≥ n.

Proof. We first show that u_max simulates the probabilistic process of the unique leader l of Theorem
1. Recall that in the protocol of Theorem 1 all other nodes are initially q_0, and when l meets a q_0 it
makes it q_1 and when it meets a q_1 it makes it q_2, every time counting in the corresponding counter.
First, observe that u_max is never deactivated, i.e. active_{u_max} = 1 forever, because it never interacts
with a greater id nor with a node that belongs to a greater id than its own. It suffices to show
that when u_max meets a node for the first time it marks it once (simulating a q_0 to q_1 conversion),
when it meets a node for the second time it marks it twice (simulating a q_1 to q_2 conversion), and
that no other node can ever alter the nodes marked by u_max. When u_max interacts with a node v
for the first time, then either belongs_v = ⊥ or ⊥ ≠ belongs_v < max_id. So, in this case it marks v
once by setting marked_v ← 1, belongs_v ← max_id, and records this by increasing count1_{u_max} by
one. From now on, no other active node w ≠ u_max can ever affect the state of v, because for every
such w it holds that id_w < belongs_v = max_id and the only effect in this case is the deactivation
of w. The second time that u_max interacts with v, it still holds that belongs_v = id_{u_max} (= max_id)
and marked_v = 1, and u_max marks v for a second time by setting marked_v ← 2 and records this
by incrementing count2_{u_max} by one. Again, v still belongs to max_id and no other node can ever

Protocol 3 Counting with UIDs

Initialization: Every node u has a unique id id_u and maintains a (belongs, marked) pair, a (count1, count2) pair,
and a variable active, where belongs ∈ U ∪ {⊥} initially ⊥, marked ∈ {0, 1, 2} initially 0, count1, count2 ∈ ℕ_{≥0}
initially count1 = count2 = 0, and active ∈ {0, 1} initially 1. All nodes know a predetermined constant b > 0.
The following is the code for every interaction of u, v with id_u > id_v.

if active_v = 1 then
    active_v ← 0
end if
if active_u = 1 then
    if belongs_v = ⊥ or ⊥ ≠ belongs_v < id_u then
        belongs_v ← id_u
        marked_v ← 1
        count1_u ← count1_u + 1
    end if
    if ⊥ ≠ belongs_v > id_u then
        active_u ← 0
    end if
    if belongs_v = id_u and marked_v = 1 and count1_u ≥ b then
        marked_v ← 2
        count2_u ← count2_u + 1
        if count1_u = count2_u then
            u halts and outputs 2 · count1_u
        end if
    end if
end if

affect its state. We conclude that if we were only interested in u_max's output then, by Theorem 1,
this would w.h.p. be an upper bound on n.

However, observe that not only u_max but also the other nodes execute a similar process and it
could be the case that one of them terminates early (and before u_max) giving as output a wrong
count. We now show that this is not the case. Take any node w with id_w < max_id. Consider the
partition of U \ {w} into the sets S_{w,0}, S_{w,1}, and S_{w,2} of nodes which w has not marked yet, has
marked once, and has marked twice, respectively. S_{w,1} cannot increase without w being involved, so
the only possibility that may increase the rate of growth of count2_w w.r.t. count1_w is when a node
v ∈ S_{w,0} gets marked by a node with id greater than id_w, because such a v can no longer contribute
to count1_w. However, observe that every such v will from now on forever satisfy belongs_v > id_w,
because belongs_v can only increase, and every interaction of w with such a v will result in the
deactivation of w. This implies that the “success” events of w (those corresponding to a count1_w
increment) have now been partitioned into increment events and deactivation events. So, if w ever
fails to increment count1_w due to an interference of some u with id_u > id_w on some v ∈ S_{w,0},
the effect is the deactivation of w, which clearly does not allow w to continue with unfavorable
probabilities. □
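
Protocol 3 can be exercised with a short simulation. The sketch below rests on assumptions that should be read as such: the three inner cases of the protocol are treated as mutually exclusive and decided on the state at the start of the interaction (so a first meeting marks v once and not twice, as the proof describes), the scheduler is uniform random, and ids are the integers 0, ..., n − 1. The names Node, interact and run_protocol3 are hypothetical.

```python
import random

BOT = None  # stands for the "belongs to nobody yet" value

class Node:
    def __init__(self, uid):
        self.id = uid
        self.belongs = BOT
        self.marked = 0
        self.count1 = 0
        self.count2 = 0
        self.active = 1

def interact(u, v, b):
    """One interaction with id_u > id_v; returns 2*count1_u if u halts, else None."""
    if v.active == 1:
        v.active = 0                      # the smaller id is always deactivated
    if u.active != 1:
        return None
    if v.belongs is BOT or v.belongs < u.id:
        v.belongs = u.id                  # first meeting: mark v once
        v.marked = 1
        u.count1 += 1
    elif v.belongs > u.id:
        u.active = 0                      # v is owned by a larger id
    elif v.marked == 1 and u.count1 >= b:
        v.marked = 2                      # second meeting: mark v twice
        u.count2 += 1
        if u.count1 == u.count2:
            return 2 * u.count1
    return None

def run_protocol3(n, b, seed=0, max_steps=10**6):
    """Returns (id of the halting node, its output), or None if max_steps is hit."""
    rng = random.Random(seed)
    nodes = [Node(uid) for uid in range(n)]
    for _ in range(max_steps):
        x, y = rng.sample(nodes, 2)
        u, v = (x, y) if x.id > y.id else (y, x)
        out = interact(u, v, b)
        if out is not None:
            return u.id, out
    return None
```

In such runs one would expect the halting node to be the one with the maximum id in most executions, in line with Theorem 3, but the sketch does not enforce this.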


6 Generic Constructors 


In this section, we give a characterization for the class of constructible 2D shape languages. In
particular, we establish that shape constructing TMs (defined in Section 3) can be simulated by
our model and therefore we can realize their output-shape in the actual distributed system. To this
end, we begin in Section 6.1 by adapting the Counting-Upper-Bound protocol of Section 5 to work
in our model. The result is, again w.h.p., a line of length Θ(log n) with a unique leader containing
n in binary. Then, in Section 6.2 the leader exploits its knowledge of n to construct a √n × √n
square. In the sequel (Section 6.3), it simulates the TM on the square n distinct times, one for each
pixel of the square. Each time, the input provided to the TM is the index of the pixel and √n, both
in binary. Each simulation decides whether the corresponding pixel should be on or off. When all
simulations have completed, the leader releases in the solution, in a systematic way, the connected
shape consisting of the on pixels and the active edges between them. The connections of all other
(off) pixels become deactivated and the corresponding nodes become free (isolated) nodes in the
solution.


6.1 Storing the Count on a Line 

We begin by adapting the Counting-Upper-Bound protocol of Theorem 1 so that when the protocol
terminates the final correct count is stored distributedly in binary on an active line of length log n.

Counting-on-a-Line Protocol: The probabilistic process that is being executed is essentially the
same as that of the Counting-Upper-Bound protocol. Again the protocol assumes a unique leader
that forever controls the process. A difference now is that every node has four ports (in the 2D
case). The leader operates as a TM that stores the r_0 and r_1 counters in its tape in binary. The
ith cell of the tape has two components, one storing the ith bit of r_0 and the other storing the
ith bit of r_1. We say that the tape is full if the bits of all r_0 components of the tape are set to
1. The tape of the TM is the active line that the leader has formed so far, each node in the line
implementing one cell of the tape. Initially the tape consists of a single cell, stored in the memory
of the unique leader node. As in Counting-Upper-Bound, the leader first tries to obtain an initial
advantage of b for the r_0 counter. To achieve the advantage, the leader does not count the q_1s
that it interacts with until it holds that r_0 ≥ b. Observe that the initial length of the tape is not
sufficient for storing the binary representation of b (of course b is constant so, in principle, it could
be stored on a single node, however we prefer to keep the description as uniform as possible).

In order to resolve this, the leader does the following. Whenever it meets the left port of a q_0 from its
right port, if its tape is not full yet, it switches the q_0 to q_1, leaving it free to move in the solution,
and increases the r_0 counter by one. To increase the counter, it freezes the probabilistic process
(that is, during freezing it ignores all interactions with free nodes), and starts moving on its tape,
which is a distributed line attached to its left port. After incrementing the counter, the leader keeps
track of whether the tape is now full and then it moves back to the right endpoint of the line to
unfreeze and continue the probabilistic process. On the other hand, if the tape is full, it binds the
encountered q_0 to its right by activating the connection between them (thus increasing the length
of the tape by one), then it reorganizes the tape, it again increases r_0 by one, and finally moves
back to the right endpoint to continue the probabilistic process. This time, the leader also records
that it has bound a q_0 that should have been converted to q_1. This debt is also stored on the tape
in another counter r_2.

Whenever the leader meets a q_2, if r_2 ≥ 1, it converts the q_2 to q_1 and decreases
r_2 by one. So, q_2s may be viewed as a deposit that is used to pay back the debt. In this manner,
the q_0s that are used to form the tape of the leader are not immediately converted to q_1 when first
counted. Instead, the missing q_1s are introduced at a later time, one after every interaction of the
leader with a q_2, and all of them will be introduced eventually, when a sufficient number of q_2s
become available. Finally, whenever the leader interacts with the left port of a q_1 from its right
port, it freezes, increases the r_1 counter by one (observe that r_0 ≥ r_1 always holds, so the length
of the tape is always sufficient for r_1 increments), and checks whether r_0 = r_1. If equality holds,
the leader terminates, otherwise it moves back to the right endpoint and continues the process.
Correctness is captured by the following lemma.

Lemma 1. The Counting-on-a-Line protocol terminates in every execution. Moreover, when the leader
terminates, w.h.p. it has formed an active line of length log n containing n in binary in the r_0
components of the nodes of the line (each node storing one bit).

Proof. We begin by showing that the probabilistic process of the Counting-Upper-Bound protocol is
not negatively affected in the Counting-on-a-Line protocol. This implies that the high probability
argument of Theorem 1 holds also for Counting-on-a-Line (in fact it is improved). First of all,
observe that the four ports of the nodes introduce more choices for the scheduler in every step.
However, these new choices, if treated uniformly, result in the same multiplicative factor for both
the “positive” (an (l, q_0) interaction) and the “negative” (an (l, q_1) interaction) events, so the
probabilities of the process are not affected at all by this. Moreover, the debt does not affect the
process either. The reason is that the only essential difference w.r.t. the process is that the conversion
of some counted q_0s to the corresponding q_1s is delayed. But this only decreases the probability of
early termination and thus of failure. It remains to show that not even a single q_1 remains forever
as debt, because, otherwise, some executions of the protocol would not terminate. The reason is
that the protocol cannot terminate before converting all the q_1s plus the debt to q_2. To this end,
observe that the line of the leader always has length ⌊lg r_0⌋ + 1, thus r_2 ≤ ⌊lg r_0⌋, because the debt
is always at most the length of the line excluding the initial leader. So, at least r_0 − ⌊lg r_0⌋ nodes
have been successfully converted from q_0 to q_1, which implies that there is an eventual deposit of
at least r_0 − ⌊lg r_0⌋ nodes in state q_2. These q_2s are not immediately available, but they will for
sure become available in the future, because every interaction of the leader with a q_1 results in a
q_2. Finally, observe that r_0 − ⌊lg r_0⌋ ≥ ⌊lg r_0⌋ holds for all r_0 ≥ 1 (to see this, simply rewrite it as
r_0/2 ≥ ⌊lg r_0⌋). Thus, r_0 − ⌊lg r_0⌋ ≥ r_2, which means that the eventual deposit is not smaller than
the debt, so the protocol eventually pays back its debt and terminates. □
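
The bookkeeping of the counters and the debt can be checked with an abstract simulation. The sketch below deliberately ignores the geometry (ports, walking along the tape, freezing) and tracks only r_0, r_1, the debt r_2 and the tape length. It assumes that in every step the leader meets a uniformly random free node and that n − 1 ≥ b. The function names are hypothetical.

```python
import random

def counting_on_a_line(n, b, seed=0, max_steps=10**7):
    """Abstract run of the counting process with tape growth and debt.

    Returns (r0, r1, tape_len) at termination, or None if max_steps is hit.
    """
    rng = random.Random(seed)
    free = [0] * (n - 1)          # free nodes; 0, 1, 2 stand for q0, q1, q2
    r0 = r1 = r2 = 0
    tape_len = 1                  # the leader itself is the first cell
    for _ in range(max_steps):
        k = rng.randrange(len(free))
        s = free[k]
        if s == 0:
            if r0 == 2 ** tape_len - 1:   # tape full: all r0 bits are 1
                free.pop(k)               # bind the q0 to the line
                tape_len += 1
                r2 += 1                   # the missing q1 becomes debt
            else:
                free[k] = 1
            r0 += 1
        elif s == 2 and r2 >= 1:
            free[k] = 1                   # pay back one unit of debt
            r2 -= 1
        elif s == 1 and r0 >= b:
            free[k] = 2
            r1 += 1
            if r0 == r1:
                return r0, r1, tape_len
        # invariants used in the proof: length floor(lg r0) + 1, debt <= floor(lg r0)
        assert tape_len == max(1, r0.bit_length())
        assert r2 <= max(0, r0.bit_length() - 1)
    return None

def deposit_covers_debt(limit):
    """Check r0 - floor(lg r0) >= floor(lg r0) for all 1 <= r0 < limit."""
    return all(r - (r.bit_length() - 1) >= r.bit_length() - 1
               for r in range(1, limit))
```

The in-loop assertions correspond to the two facts used in the proof, namely that the line has length ⌊lg r_0⌋ + 1 and that r_2 ≤ ⌊lg r_0⌋. The helper deposit_covers_debt checks the final inequality numerically for small r_0.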


6.2 Constructing a √n × √n Square


We now show how to organize the nodes into a spanning square, i.e. a √n × √n one. As we did in
Section 4.2, we again assume for simplicity that √n is integer. Observe that now the leader has n
stored in its line. Our construction exploits this knowledge and this makes it essentially different
from the constructions of Section 4.2. Moreover, knowledge of n allows the protocol to terminate
after constructing the square and to know that the square has been successfully constructed, a
fact that was not the case in the stabilizing constructions of Section 4.2. The following protocol
assumes that the guarantee of Lemma 1 is provided somehow and based on this assumption we
will show that it works correctly in every execution (this is in contrast to the high probability
argument of Lemma 1). This means that given the guarantee, the protocol that constructs the
square is always correct. Of course, if we take the composition of Counting-on-a-Line that provides
the guarantee and the protocol that constructs the square based on the guarantee, the resulting
protocol is again correct w.h.p., however we still allow the possibility that some other deterministic
(even centralized) preprocessing provides the required guarantee.


Square-Knowing-n Protocol: The initial leader L first computes √n on its line by any plausible
algorithm (observe that the available space for computing the square root is exponential in the
binary representation of n, which is the input to the algorithm, because, if needed, the leader can
expand its line up to length n). In principle, it is not necessary to use additional space, because
the leader can execute one after the other the multiplications 1·1, 2·2, 3·3, ... in binary until
the result becomes equal to n. Each of these operations can be executed in the initial log n space
of the line of the leader. The time needed, though exponential in the binary representation of n,
is still linear in the population size n. Now that the leader also knows √n, it expands its line to
the right by attaching free nodes to make its length √n. Then it exploits the down ports to create
a replica of its line. The replica also has length √n and has its own leader but in a distinguished
state L^s. This new line plays the role of a seed that starts creating other self-replicating lines of
length √n. In particular, the seed attaches free nodes to its down ports, until all positions below
the line are filled by nodes and additionally all horizontal connections between those nodes are
activated. Then it introduces a leader L^r to one endpoint of the replica and starts deactivating the
vertical connections to release the new line of length √n. These lines with L^r leaders are totally
self-replicating, meaning that their children also begin in state L^r. The initial leader L waits until
the up ports of a non-seed replica r become totally aligned with the down ports of the square
segment that has been constructed so far. So, initially it waits until a replica becomes attached to
the lower side of its own line. When this occurs, it activates all intermediate vertical connections
to make the construction rigid and increments a row-counter by one (initially 0) and moves to the
new lowest row. If at the time of attachment r was in the middle of an incomplete replication, then
there will be nodes attached to the down ports of r. L releases all these nodes, by deactivating
the active connections of r to them, and then waits for another non-seed replica to arrive. When
the row-counter becomes equal to √n − 1, the leader for the first time accepts the attachment of
the seed to its construction and when the seed is successfully attached the leader terminates. This
completes the construction of the √n × √n square. See Figures 5 and 6 for illustrations.
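
The square-root step described at the start of the protocol is simple enough to state as code. The sketch below only illustrates the "1·1, 2·2, 3·3, ..." idea on ordinary integers. It does not model the binary arithmetic on the leader's line, and the function name is hypothetical.

```python
def sqrt_by_successive_squares(n):
    """Return k with k*k == n by trying 1*1, 2*2, 3*3, ...

    n is assumed to be a perfect square, as in the text.
    """
    k = 1
    while k * k < n:
        k += 1
    if k * k != n:
        raise ValueError("n is not a perfect square")
    return k
```

The loop performs at most √n multiplications of numbers that fit in O(log n) bits, which is consistent with the remark that the time is exponential in the binary length of n but modest in terms of the population size.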

The reason for attaching the seed last, and in particular when no further free nodes have
remained, is that otherwise self-replication could possibly cease in some executions. Observe also
that we have allowed the L-leader to accept the attachment of a replica to the square segment even
though the replica may be in the middle of an incomplete replication. This is important in order
to avoid reaching a point at which some free lines are in the middle of incomplete replications but
there are no further free nodes for any of them to complete. For a simple example, consider the
seed and a replica r and √n free nodes (all other nodes have been attached to the square segment).
It is possible that √n − 1 of the free nodes become attached to the seed and the last free node
becomes attached to r. We have overcome this deadlock by allowing L to accept the attachment of
r to the square segment. When this occurs, the free node will be released and eventually it will be
attached to the last free position below the seed.

We now give, in Protocol 4, one of the possible codes for the replication process of the original
leader’s line that creates the seed. The other replication processes, i.e. of L^s to L^r and of L^r to L^r,
are almost identical to this one. Without loss of generality we assume that the original leader’s line
has state L on its left endpoint, e on its right endpoint, and every other internal node of the line
is in state i. All other (free) nodes are in state q_0.

We additionally show that, in principle, the lines do not need a leader in order to successfully
self-replicate. We give such a protocol which is “more parallel” and has a much more concise
description than the previous one. We now assume that one line has e on both of its endpoints and
i on the internal nodes, and every (free) node is in state q_0. The code is presented in Protocol 5.
The protocol works as follows. Free nodes are attached below the nodes of the original line. When



Fig. 5. (a) Several free nodes have already been attached to the original line. Some of them have already activated
some horizontal connections forming some segments of the replica. (b) The leader of the original line remains
blocked while the leader of the replica has detected that the replica is ready for detachment. It has already
detached the three rightmost nodes and keeps moving to the left until it reaches the left endpoint and detaches the
whole replica. (c) The seed replica has been released in the solution. The leader of the original line has woken
up and is restoring the nodes of its line to their original states. When it finishes (that is, when it has traversed
the whole line and returned to the left endpoint), it will go to state L_start to start the square formation process.
Similarly, the leader of the seed replica is setting the nodes of its line to their normal i and e states, so that they
start accepting the attachment of other nodes in order to create non-seed replicas.

Fig. 6. The seed at the top has created another replica which has just been released in the solution. Below it, some 
additional replicas appear. One of them is in the middle of a replication that has not completed yet. There are also 
several nodes that are still free. At the bottom appears the square segment that has been constructed so far. The 
original line of the L-leader is the one at the top of the rectangle. The other rows below it have been formed by 
replicas that have been attached to the segment in previous steps. The L-leader keeps waiting at the bottom left 
corner for new replicas to arrive. One such has just arrived and will be attached to the segment.

Protocol 4 Line-Replication


Q = {L,L',Li,L t a ,L t f,lS e ' 
5: 


L*'. T/ .iJ", 


L start , i, i , e }, j £ { 1 , 2 , . . . , 7 } 


(L, d), (q_0, u), 0 → (L', L^s_1, 1)
(i, d), (q_0, u), 0 → (i', i', 1)
(e, d), (q_0, u), 0 → (e', e', 1)
(i', r), (i', l), 0 → (i', i', 1)
(i', r), (e', l), 0 → (i', e', 1)
(L^s_1, r), (i', l), 0 → (e', L^s_2, 1)
(L^s_2, r), (i', l), · → (i', L^s_2, 1)
(L^s_2, r), (e', l), · → (i', L^s_3, 1)
(L^s_3, u), (e', d), 1 → (L^s_4, e', 0)
(i', r), (L^s_4, l), 1 → (L^s_5, e', 1)
(L^s_5, u), (i', d), 1 → (L^s_6, i', 0)
(i', r), (L^s_6, l), 1 → (L^s_5, i', 1)
(e', r), (L^s_6, l), 1 → (L^s_7, i', 1)
(Ll,u),(L',d),l-*-(ii,i*,0) 

(**, r), (*', Z), 1 ->■ (e', x* , 1), ® £ (L, L s } 
(a:* , r), (*', Z), 1 -¥ (i' , x t , 1), x 6 {L, Z, s } 
(x*' ,r),(e',l), 1 ->• (®‘ ,e, l),as € {h,h s } 
(*V),(a7 , Z), 1 —>• (a;* ,i, 1),* £ {L,L a } 
(e',r), (L* , Z), 1 —(i s ,i, 1) 

(e',r), (L* , Z), 1 —>• {L st art,i, 1) 


a node is attached below an internal node i both become i_1, and when a node is attached below
an endpoint e, both become e_1. Moreover, adjacent nodes of the replica connect to each other and
every such connection increases their index. In fact, their index counts their degree. An internal
node of the replica can detach from the original line only when it has degree 3, that is when, apart
from its vertical connection, it has also already become connected to both a left and a right neighbor
on the replica. On the other hand, an endpoint detaches when it has a single internal neighbor. It
follows that the replica can only detach when its length (counted in number of horizontal active
connections) is equal to that of the original line. To see this, assume that a shorter line detaches
at some point. Clearly, such a line must have at least one endpoint that corresponds to an internal
node i_j of the replica. But this node is an endpoint of the shorter line, so its degree is less than 3,
i.e. j < 3, and we conclude that it cannot have detached.
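
The degree argument can be illustrated with a small simulation of a single leaderless replication. The sketch abstracts the states into an index equal to the degree of each replica node, as described above, and assumes a scheduler that picks uniformly among the currently enabled events. It models one replication only (positions are not reused after detachment) and assumes a line of at least three nodes. The function name is hypothetical.

```python
import random

def replicate_line(m, seed=0):
    """Simulate one leaderless replication below a line of m >= 3 nodes.

    Returns the number of active horizontal connections of the replica
    once no further event is enabled.
    """
    rng = random.Random(seed)
    attached = [False] * m          # a node has been attached below position p
    vertical = [False] * m          # vertical connection at position p is active
    horizontal = [False] * (m - 1)  # replica connection between p and p + 1

    def index(p):
        # degree of the replica node at position p
        left = p > 0 and horizontal[p - 1]
        right = p < m - 1 and horizontal[p]
        return int(vertical[p]) + int(left) + int(right)

    def needed(p):
        # endpoints detach at index 2, internal nodes at index 3
        return 2 if p in (0, m - 1) else 3

    while True:
        moves = [("attach", p) for p in range(m) if not attached[p]]
        moves += [("connect", p) for p in range(m - 1)
                  if attached[p] and attached[p + 1] and not horizontal[p]]
        moves += [("detach", p) for p in range(m)
                  if vertical[p] and index(p) == needed(p)]
        if not moves:
            return sum(horizontal)
        kind, p = rng.choice(moves)
        if kind == "attach":
            attached[p] = vertical[p] = True
        elif kind == "connect":
            horizontal[p] = True
        else:
            vertical[p] = False
```

Whatever order the scheduler chooses, a node can lose its vertical connection only after all of its horizontal connections exist, so the released replica always has m − 1 horizontal connections, i.e. the same length as the original line.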

Lemma 2. There is a protocol (described above) that when executed on n nodes (for all n with
integer √n) w.h.p. constructs a √n × √n square and terminates.

Proof. From Lemma 1, when the leader in the Counting-on-a-Line protocol terminates, w.h.p. it has
formed an active line of length log n containing n in binary in the r_0 components of the nodes of

Protocol 5 No-Leader-Line-Replication

Q = {q_0, e, e_1, e_2, i, i_1, i_2, i_3}

δ:

(i, d), (q_0, u), 0 → (i_1, i_1, 1)
(e, d), (q_0, u), 0 → (e_1, e_1, 1)
(i_j, r), (i_k, l), 0 → (i_{j+1}, i_{k+1}, 1) for all j, k ∈ {1, 2}
(i_1, r), (e_1, l), 0 → (i_2, e_2, 1)
(i_2, r), (e_1, l), 0 → (i_3, e_2, 1)
(e_1, r), (i_1, l), 0 → (e_2, i_2, 1)
(e_1, r), (i_2, l), 0 → (e_2, i_3, 1)
(i_3, u), (i_1, d), 1 → (i, i, 0)
(e_2, u), (e_1, d), 1 → (e, e, 0)


the line. Then the leader computes √n on its line and expands its line to make its length √n.
Next the leader creates the seed replica by executing the routine described in Protocol 4. The
seed replica keeps creating new self-replicating replicas. All these replications are performed by
a routine essentially equivalent to Protocol 4. Every replica is a line of length √n and will be
eventually attached to the square-segment to form another row of the square. First observe that
the seed may only be attached to the square when the square has already obtained √n − 1 rows.
This implies that replications do not cease before the square has been successfully constructed.
Additionally, any non-seed replica r can be attached to the square-segment (whenever the L-leader
is in the state of waiting for new attachments) independently of whether r is in the middle of an
incomplete replication. The reason is that attachment occurs via the up ports of r while replication
takes place via the down ports of r. If this occurs, then the nodes of the incomplete replication are
simply released as free nodes. So, assume that there are k nodes that are either free or part of an
incomplete replication. We only have to prove that as long as k ≥ √n then eventually another replica
has to be formed. If not, then for an infinite number of steps it holds that k ≥ √n. Moreover, every
non-seed replica in a finite number of steps becomes attached to the square-segment and releases
any nodes of an incomplete replication. Thus, in a finite number of steps, every one of the k ≥ √n
nodes is either free or part of an incomplete replication of the seed. Clearly, given that the seed does
not cease self-replication and given that there are enough nodes to fill the √n replication positions
of the seed, in a finite number of steps (due to fairness) all these positions should have been filled
and a replica should have been created. Thus, the assumption that no further replication occurs
violates the fairness condition. □


6.3 Simulating a TM 

We now assume as given (from the discussion of the previous section) a √n × √n square with a
unique leader L at the bottom left corner. However, keep in mind that, in principle, the simulation
described here can begin before the construction of the √n × √n square is complete. The only
difference in this case is that the two processes are executed in parallel and if at some point the
TM needs more space, it has to wait until it becomes available. The square may be viewed as a
TM-tape of length n traversed by the leader in a “zig-zag” fashion, first moving to the right until
the bottom right corner is encountered, then one step up, then to the left until the node above
the bottom left corner is encountered, then one step up again, then right, and so on. To simplify
this process, we may assume that a preprocessing has marked appropriately the turning points (see
Figure 7(b)). The tape will be used to simulate a TM M of the form described in Section 3.
The n pixels of the square are numbered according to the above zig-zag process beginning from
the bottom left node, each node corresponding to one pixel. The space available to the TM is
exponential in the binary representation of the input (i, n) (or (i, √n)), because i ≤ n − 1 and
therefore the length of its binary representation is |i| = O(log n), thus |(i, n)| = O(log n), but the
available space is Θ(n) = Θ(2^{log n}) = Ω(2^{|(i,n)|}) (still it is linear in the size of the whole shape to
be constructed).

The protocol invokes $n$ distinct simulations of $M$, one for each of the pixels
$i \in \{0, 1, \ldots, n-1\}$, beginning from $i = 0$ and every time incrementing $i$ by one. The leader
maintains the current value of $i$ in binary, in a pixel-counter pixel stored in the $O(\log n)$
leftmost cells of the tape.³ Recall that the leader knows $n$ from the procedures of the previous
sections. So, we may assume that the tape also holds in advance $n$ and $\sqrt{n}$ in binary (again in
the leftmost cells). Initially pixel = 0 and the leader marks the 0th node, that is, the bottom left
corner of the square. Then it simulates $M$ on input (pixel, $\sqrt{n}$). When $M$ decides, if its
decision is accept, the leader marks the node corresponding to pixel as on, otherwise it marks it as
off. Then the leader increments pixel by one, marks the node corresponding to the new value of pixel
(which is the next node on the tape), clears the tape of residues of the previous simulation, invokes
another simulation of $M$ on the new value of pixel, and marks the corresponding node as on or off
according to $M$’s decision. The process stops when pixel = $n$, in which case no further simulation
is executed. Alternatively, the leader can detect termination by exploiting the fact that the last
pixel to be examined is the one corresponding to the upper left or upper right corner of the square
(depending on whether $\sqrt{n}$ is even or odd), which can be detected.
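
As a rough illustration of the sequential loop above, the following sketch replaces the simulated TM $M$ by an ordinary Python predicate. The decider `plus_decider` (a plus sign through the middle row and column) is a hypothetical example and not a shape taken from this work; `zigzag_cell` and `label_square` are likewise assumed names.

```python
from typing import Callable, List, Tuple


def zigzag_cell(i: int, d: int) -> Tuple[int, int]:
    """(row, col) of tape position i on a d x d square traversed in zig-zag."""
    row, offset = divmod(i, d)
    return row, (offset if row % 2 == 0 else d - 1 - offset)


def plus_decider(i: int, d: int) -> bool:
    """Hypothetical stand-in for M: accept pixels on the middle row or column."""
    row, col = zigzag_cell(i, d)
    return row == d // 2 or col == d // 2


def label_square(decider: Callable[[int, int], bool], d: int) -> List[bool]:
    """Run one 'simulation' per pixel, in tape order, and record on/off."""
    n = d * d
    labels = [False] * n
    pixel = 0                      # the pixel-counter kept by the leader
    while pixel != n:              # the process stops when pixel = n
        labels[pixel] = decider(pixel, d)   # accept -> on, reject -> off
        pixel += 1
    return labels
```

For `d = 3` the plus sign occupies five of the nine pixels, and these five pixels form a connected shape, as required of the languages considered here.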

When the above procedure ends, the leader starts walking the tape in the opposite direction 
until it reaches the bottom left corner. On the way, it passes a release signal to every node it goes 
through. A node enters the release phase exactly when the leader departs from that node, apart 
from the bottom left corner which enters the release phase when the leader arrives. When two nodes 
that are both in the release phase interact, if at least one of them is off and their connection is 
active, they deactivate the connection. Clearly, the only nodes that will remain connected in the 
solution are the on nodes forming the desired connected 2-dimensional shape that M computes. If 
we additionally require the leader to know when all deactivations have completed and terminate, 
then we can either (i) have the leader deactivate them itself while moving backwards, also ensuring 
that it does not remain on a node that will be released, or (ii) have the leader repeatedly explore 
the final connected shape until it detects that all potential deactivations have occurred. 
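
The effect of the release phase can be sketched as follows, under the simplifying assumption that all grid edges of the square are active when releasing starts. The helper names (`surviving_edges`, `is_connected`) are illustrative only. A connection survives exactly when both of its endpoints are on.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]
Edge = Tuple[Cell, Cell]


def surviving_edges(on: Set[Cell], d: int) -> Set[Edge]:
    """Grid edges of a d x d square that are NOT deactivated by the release phase."""
    edges: Set[Edge] = set()
    for r in range(d):
        for c in range(d):
            for nr, nc in ((r + 1, c), (r, c + 1)):
                # An edge is deactivated if at least one endpoint is off.
                if nr < d and nc < d and (r, c) in on and (nr, nc) in on:
                    edges.add(((r, c), (nr, nc)))
    return edges


def is_connected(nodes: Set[Cell], edges: Set[Edge]) -> bool:
    """Breadth-first search check that the remaining nodes form one component."""
    if not nodes:
        return True
    adj: Dict[Cell, List[Cell]] = {v: [] for v in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == nodes
```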

The following theorem states the lower bound implied by the construction described in this 
section. 

Theorem 4. Let $\mathcal{L} = (S_1, S_2, \ldots)$ be a connected 2D shape language, such that $\mathcal{L}$
is TM-computable in space $d^2$. Then there is a protocol (described above) that w.h.p. constructs
$\mathcal{L}$. In particular, for all $d \ge 1$, whenever the protocol is executed on a population of size
$n = d^2$, w.h.p. it constructs $S_d$ and terminates. In the worst case, when $G_d$ (that is, the shape
of $S_d$) is a line of length $d$, the waste is $(d-1)d = O(d^2) = O(n)$.

3 When we refer to the tape, we mean the line produced by traversing the square in a zig-zag way beginning from the 
bottom-left node, as described above. So the “leftmost”, here, corresponds to the leftmost nodes of the line, e.g. 
the left part of the bottom row of the square, and should not be confused with the nodes on the leftmost column 
of the square. 

Fig. 7. (a) The $\sqrt{n} \times \sqrt{n}$ square has just been constructed. (b) The virtual tape on the square. The arrows show the 
direction in which the tape is traversed from left to right (opposite arrows for the opposite direction are not shown). 
The two endpoints of the tape are marked as black here and the turning points are marked as gray. These help 
the leader to detect and choose the right action, e.g. turn left twice (equivalently, follow the up port and then the 
left port) when it arrives at the bottom right corner and wishes to continue on the second row. The indices of the 
pixels that the procedure assumes follow the order of the tape, that is, the first position of the tape corresponds 
to pixel 0, the second to pixel 1, ..., the last position of the tape to pixel $n - 1$. (c) The shape, which looks like a 
star, has been formed on the square. It consists of the pixels that the TM $M$ decided to be on, which are colored 
black here. All other white pixels are the off pixels. The simulations have completed and the leader has just reached 
the upper right corner and now it starts releasing the shape. To improve visibility, the edges that will eventually be 
deactivated appear as dotted here. (d) Releasing is almost complete. The leader has reached the bottom left corner 
and has updated all nodes to the release phase. Any connection involving at least one off node (i.e. a white one) will 
eventually be deactivated. 

Proof. We have to show that, for every $n = d^2$, the protocol constructs $G_d$ when executed on $d^2$
nodes. From Lemma 2 we have a subroutine that terminates having w.h.p. constructed a $d \times d$ square
with a unique leader on the bottom left node. Next, the leader can easily organize the square into
a tape of length $d^2$ that has $d$ stored in binary in its leftmost cells. Moreover, $\mathcal{L}$ is
computable, so, by Definition 3, there is a TM $M$ that when executed on the pixels of a $d \times d$
square constructs $S_d$. The protocol simulates $M$ on the pixels of such a $d \times d$ square, thus the
result is $S_d$, which is an on/off labeled $d \times d$ square the on pixels of which form $G_d$. To
perform the simulation, the protocol just feeds $M$ with $(i, d) = (0, d), (1, d), \ldots, (d^2 - 1, d)$,
one at a time, simulates $M$ on input $(i, d)$ in space $O(d^2)$, marks the corresponding pixel as on
or off according to $M$'s decision, and moves on to the next input. When $i = d^2$, the square contains
$G_d$ and the leader releases $G_d$ by one of the terminating approaches described above and
terminates. Observe that, given the guarantees of Lemma 2, the procedure described here is always
correct. So, the probability of failure of the whole protocol is just the probability of failure of
the initial counting subroutine, thus the protocol succeeds w.h.p. Finally, the waste is always equal
to the number of pixels of the $d \times d$ square that are not part of $G_d$. Observe now that the waste
can never be more than $(d-1)d$, because if it was at least $(d-1)d + 1 = d^2 - d + 1$, then the size of
$G_d$ (i.e. the useful space) would be at most $d^2 - (d^2 - d + 1) = d - 1$. But then, connectivity of
$G_d$ implies that $\mathrm{max\_dim}_{G_d} \le d - 1$, which contradicts the assumption that
$\mathrm{max\_dim}_{G_d} = d$. Thus, the worst possible waste is indeed $(d-1)d = O(d^2) = O(n)$. Notice
that here the waste of the protocol is equal to the waste of the simulated TM: the protocol just
provides the maximum square that fits in the population and the TM determines which nodes will be
part of the final shape and which will be thrown away as waste. □ 


Remark 3. It is worth mentioning that if the system designer knew $n$ in advance, then he/she could
preprogram the nodes to simulate a TM that constructs a specific shape of size $n$, for example the
TM corresponding to the Kolmogorov complexity of the shape (which is in turn the Kolmogorov
complexity of the desired binary pixel sequence $(s_0, s_1, \ldots, s_{n-1})$). However, in this work we
consider systems in which $n$ is not known in advance, so the natural approach is to preprogram the
nodes with a TM that can work for all $n$. The protocol must first compute $n$ (w.h.p.) and then
simulate the TM on input $n$ to construct a shape of the appropriate size. For example, it could be a
TM constructing a star, as in Figure 7(c), such that the size of the star grows as $n$ grows. 


Remark 4. The above results can be immediately modified to refer to patterns instead of shapes. 
In fact, observe that the $\sqrt{n} \times \sqrt{n}$ square that has been labeled by off and on by the TM is already 
such a (computed) 0/1 pattern. The generic idea to extend this is to keep the same constructor 
as above and simulate TMs that for every pixel output a color from a set of colors $C$. Then the 
resulting square with its nodes labeled from $C$ is the desired computed pattern and no releasing is 
required in this case. 


6.4 Parallelizing the Simulations 

We now present two approaches for parallelizing the simulations, instead of executing them
sequentially one after the other. 

6.4.1 Approach 1 The first approach uses the 3D model (that is, the one with 6 ports) to 
construct a 2D shape as before. Assume that the shape $G$ has $\mathrm{max\_dim}_G = d$ ($d$ could be a 
function of $n$; e.g. it was $O(\sqrt{n})$ in the previous section). We again construct a $d \times d$ square as 
before. Assume also that there is a TM $M$ deciding the status of each of the pixels in space $k$ (again, $k$ 
could be a function of $n$, e.g. $\log n$ or $\sqrt{n}$). We assume that both $k$ and $d$ are computable in space 
$O(n)$, so that in the worst case we can create a spanning line, compute them on it by simulating a 
TM, and create seeds of the appropriate lengths as before. Assume also that we have a population 
of size $k \cdot d^2$. 


When counting of $n$ terminates, the leader computes two different seeds, one of length $d$ and
another of length $k - 1$ (this is in contrast to the unique seed of Section 6.2). Then it first
activates the seed of length $d$ and keeps the other seed in a sleeping state. As before, it constructs
a $d \times d$ square using, say, only dimensions $x$ and $y$. When this construction completes, the leader
wakes up the seed of length $k - 1$ to organize all the remaining nodes into lines of length $k - 1$ in
the $z$ dimension. Each of these lines will be attached “below”⁴ one of the pixels of the square. The
pixel and its line form together the required TM-tape of length $k$ corresponding to that pixel (see
Figure 8). So, when this process ends, we have a memory of length $k$ attached to each pixel of the
square. For a simple example, when $d = k = \sqrt{n}$, the construction so far will “look like” a
$\sqrt{n} \times \sqrt{n} \times \sqrt{n}$ cube (actually, if for physical reasons we want to have a more rigid
structure, we can also activate all edges between the lines; in this case the construction of the
previous example would be a $\sqrt{n} \times \sqrt{n} \times \sqrt{n}$ cube). Then the leader initializes each
memory to $(i, d)$, where $i$ is the index of its pixel, the indices being counted by their distance
from the “bottom left” corner of the square as before, and then informs each pixel-head to start
simulating $M$ on its tape. In this way, we have $d^2$ simulations being executed in parallel, each on
its own tape of length $k$. When all simulations have completed, the leader may first release all
memories to keep only the $d \times d$ square and then apply the releasing process described in the
previous section to release the off pixels of the square and isolate the connected shape consisting
of the pixels that are on. 



Fig. 8. The constructed $d \times d$ square lies in dimensions $x$ and $y$. We can think of its leftmost node in the 
figure as its “bottom left” corner. Every internal intersection point of the square is also a node, but we have not 
drawn these nodes here to improve visibility. “Below” it, in dimension $z$, are the $d^2$ lines of length $k$ each. The 
protocol executes a distinct simulation of the TM on each of these lines. In particular, on the line attached to 
pixel $i$, for all $0 \le i \le d^2 - 1$, the protocol simulates the TM on input $(i, d)$. 


4 It is not actually below; it is in the positive part of the $z$ dimension, but we use “below” to be consistent with the 
more intuitive illustration of Figure 5. 

Theorem 5. Let $\mathcal{L} = (S_1, S_2, \ldots)$ be a TM-computable connected 2D shape language, such that
$S_d$ is computable in space $k = f(d)$ and $k$ is computable in space $O(k \cdot d^2)$. Then there is a
protocol (described above) that w.h.p. constructs $\mathcal{L}$. In particular, for all $d \ge 1$, whenever
the protocol is executed on a population of size $n = k \cdot d^2$, w.h.p. it constructs $S_d$ and
terminates, by executing $d^2$ simulations in parallel, each with space $O(k)$. In the worst case, when
$G_d$ is a line of length $d$, the waste is $(d-1)d + (k-1)d^2 = O(k \cdot d^2)$. 
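
The resource accounting behind the waste bounds of Theorems 4 and 5 can be checked with a few lines of arithmetic. This is only a sanity-check sketch; the function names are assumptions introduced here for illustration.

```python
def waste_sequential(d: int, shape_size: int) -> int:
    """Waste of the sequential construction: pixels of the d x d square not in the shape."""
    return d * d - shape_size


def population_parallel(d: int, k: int) -> int:
    """Population used by Approach 1: d^2 pixels, each with a memory of total length k."""
    return k * d * d


def waste_parallel(d: int, k: int, shape_size: int) -> int:
    """Waste of Approach 1: discarded pixels plus d^2 auxiliary lines of length k - 1."""
    return (d * d - shape_size) + (k - 1) * d * d


# Worst case of both theorems: the shape is a line of length d.
d, k = 4, 3
assert waste_sequential(d, d) == (d - 1) * d
assert waste_parallel(d, k, d) == (d - 1) * d + (k - 1) * d * d
# Every node that is not part of the final shape counts as waste.
assert waste_parallel(d, k, d) == population_parallel(d, k) - d
```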


6.4.2 Approach 2 We now show how to achieve a similar parallelism while avoiding the use of a
third dimension. Now the unique leader that knows $n$, instead of constructing a square, constructs
a spanning line of length $d^2$, say in the $x$ dimension. This line corresponds to a linear expansion
of the pixels of the $d \times d$ square of the previous construction. Moreover, the leader creates a seed
of length $k - 1$ as before, to partition the rest of the nodes into lines of length $k - 1$, this time
in the $y$ dimension. Each such line will be attached below one of the nodes of the $x$-line. As
before, when all $y$-lines have been attached, the leader initializes their memories with $(i, d)$,
where $i$ is the index of the corresponding pixel (the index of each pixel is now its distance from
the left endpoint of the $x$-line, beginning from 0 and ending at $d^2 - 1$). Then all simulations of
$M$ are executed in parallel and eventually each one of them sets its $x$-pixel to either on or off.
When all simulations have ended, the leader releases the auxiliary memories (i.e. the $y$-lines) and
then partitions the $x$-line into consecutive segments of length $d$ by placing appropriate marks on
the boundary nodes (see Figure 9(a)). Each segment corresponds to a row of the $d \times d$ square to be
constructed. In particular, segment $i \ge 1$ counting from the left corresponds to row $i$ (rows being
counted bottom-up). Observe that, given the way the pixels have been indexed, segment 2 should match
its upper side to the upper side of segment 1 (that is, segment 2 should rotate by 180°), segment 3
should match its lower side to the lower side of segment 2, and so on. In general, if $i$ is even,
segment $i$ should match its upper side to the upper side of segment $i - 1$ and, if $i$ is odd,
segment $i$ should match its lower side to the lower side of segment $i - 1$. The leader appropriately
marks the nodes of each segment to make them aware of the orientation that they should have in the
square. Moreover, it assigns a unique key-marking to each segment so that segment $i$ can easily and
locally detect segment $i - 1$. In particular, if $i$ is odd, it marks nodes $i$ and $i - 1$ of the
segment counting from left to right (for segment 1 it only marks the leftmost node), and, if $i$ is
even, it marks nodes $i$ and $i - 1$ of the segment counting from right to left. In this manner, given
that segments respect the correct orientation and provided that attachment is only performed when
their endpoints match, every segment $i$ uniquely matches to segment $i - 1$ because the first mark of
$i$ is uniquely aligned with the second mark of $i - 1$ (see Figure 9(b)). Then the leader releases
all segments, one after the other, and it remains on the last segment. The segments are free to move
in the solution until they meet their counterpart, and when this occurs the two segments bind
together. Eventually, the $d \times d$ square is constructed and every pixel is in the correct position
(the position corresponding to its index counting in a zig-zag fashion as in the previous sections).
The leader periodically walks on its component to detect when it has become equal to the desired
square. When this occurs, it initiates as before the releasing phase to isolate the final connected
shape consisting of the on pixels. 
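
The key-marking can be checked with a short sketch. It rests on one reading of the text that should be treated as an assumption: nodes of a segment are counted from 1, the “first” mark of segment $i$ is the one at position $i - 1$ and the “second” mark is the one at position $i$, and even segments end up rotated by 180° in the square. The function names are hypothetical, and the sketch ignores the orientation and endpoint-alignment checks that the protocol also performs.

```python
from typing import Optional, Tuple


def marks_on_line(i: int, d: int) -> Tuple[Optional[int], int]:
    """(first, second) mark positions of segment i, counted from 1 from the
    left end of the segment as it lies on the x-line (before release)."""
    first, second = i - 1, i
    if i % 2 == 0:
        # Even segments are marked counting from right to left.
        first, second = d + 1 - first, d + 1 - second
    # Segment 1 only has its leftmost node marked (no first mark).
    return (first if i > 1 else None), second


def column_in_square(i: int, pos: int, d: int) -> int:
    """Column (from 1, left to right) that x-line position pos of segment i
    occupies in the final square; even segments are rotated by 180 degrees."""
    return pos if i % 2 == 1 else d + 1 - pos


def matches(i: int, j: int, d: int) -> bool:
    """True iff the first mark of segment i lies above the second mark of segment j."""
    first_i, _ = marks_on_line(i, d)
    _, second_j = marks_on_line(j, d)
    if first_i is None:
        return False
    return column_in_square(i, first_i, d) == column_in_square(j, second_j, d)
```

Under these assumptions, segment $i$ aligns with segment $j$ exactly when $j = i - 1$, which is the uniqueness property the construction relies on.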


Remark 5. In all the above constructions the unique leader assumption can be dropped at the price 
of sacrificing termination. In this case, the constructions become stabilizing by the reinitialization 
technique, as e.g. in [MS14], but they should be carefully rewritten. 

Fig. 9. (a) As in Figure 8, $d^2$ lines of length $k - 1$ each are pendent below the $d^2$ pixels. The difference now is that 
the pixels have been arranged linearly in dimension $x$. So, the whole construction is now 2-dimensional. The pixels 
have been partitioned into equal segments of length $d$ each (see the black vertical delimiters). The numbers represent 
the indices of the segments counted from left to right. The arrows leaving above or below the segments indicate 
which side of the segment should look “downwards” in the square that will be constructed. For example, segment 1 
can remain as it is, while segment 2 has to be rotated so that its upper side attaches to the upper side of segment 
1. Every segment has been marked by a black and a gray node placed at an appropriate position. (b) The segments 
have been released in the solution, and now they have to gather together in order to form the square. Each segment 
knows the correct orientation, i.e. whether it should use its up or down ports, and it can also detect its predecessor 
row by exploiting the marking. In particular, it attaches to a row if its black mark is above the gray mark of the 
other row when their orientation is correct and their endpoints are totally aligned. 



7 Replicating Arbitrary 2D Shapes 

In this section we consider the problem of replicating a given 2-dimensional shape G without using 
a third dimension. G is a connected shape with its nodes labeled on and has a unique leader on 
one of its nodes. The leader does not know $G$ nor its size. All remaining nodes are off and free in 
the solution and we assume that there is in any case a sufficient number of them (their actual number 
depends on $G$ and the replication approach). 

7.1 Approach 1 

First a squaring phase is executed, during which the shape $G$ is surrounded with additional
nodes in order to become included in the smallest rectangle $R_G$ containing it (observe that
“squaring” is actually a misnomer). As we discuss below, the leader need not control the squaring
phase, as it can be performed in parallel by all nodes of the shape by local tests and expansions.
However, the leader is the one that will detect the successful completion of the squaring phase. To
achieve this, it only has to move periodically around the connected shape until it detects that it has
become a rectangle. When this occurs, the squaring phase ends and the leader initiates the shifting
phase. W.l.o.g. let shifting be performed in terms of columns being shifted in the $x$ dimension
and let the leader begin from the leftmost column. The leader first copies the configuration of the
leftmost column (i.e. the state, on or off, of each node of the column) to separate components in
the states of the nodes of the second column. The original status of each node, that is, whether it
was initially on or off, is always maintained in a component of its state. The additional components
store the replica that is being shifted to the right. In general, the leader copies, one after the
other, the configuration of column $i$ to column $i + 1$, for every column $i$ of the rectangle $R_G$
apart from the rightmost one. When it reaches the rightmost column, which is the last column to be
copied, as there is no further column to the right, the leader first attaches free nodes to create
a new column and then performs copying as before. This completes the first shifting round. In
all subsequent shifting rounds, the leader performs shifting always beginning from the rightmost
column of the replica. Again it first introduces a new column to the right in order to shift the
rightmost column of the replica and then starts, one after the other, to shift the remaining columns
to the right. When it completes another shifting round, it moves again to the rightmost column
of the replica in order to perform the next shifting round in precisely the same way. When a
shifting round ends at the rightmost column of the original rectangle, the leader knows that
the whole shape has been totally shifted to an identical rectangle to the right and it stops the
shifting phase. Then it releases the two rectangles by deactivating the connections between the
rightmost column of the original and the leftmost column of the replica. After this, one leader
remains on the original rectangle and the other on the replica. Both execute a de-squaring phase to
release the dummy nodes that were used to form the rectangles and isolate the two identical shapes. 
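
The shifting phase can be simulated at the level of whole columns. The sketch below is an illustrative model only, not the node-level protocol: each column is a list of on/off values, and the stopping rule is interpreted as "the leftmost replica column has just moved off the rightmost original column". The function name `replicate_by_shifting` is an assumption introduced here.

```python
from typing import List, Optional, Tuple

Column = List[bool]


def replicate_by_shifting(original: List[Column]) -> Tuple[List[Column], int]:
    """Shift a copy of a w-column rectangle to the right; return (replica, rounds)."""
    w = len(original)
    if w == 0:
        raise ValueError("the rectangle must have at least one column")
    # Round 1: column i is copied into the replica component of column i + 1;
    # one fresh column is attached to receive the copy of the rightmost column.
    replica: List[Optional[Column]] = [None] + [list(col) for col in original]
    left = 1          # position of the leftmost replica column
    rounds = 1
    while left < w:   # the replica still overlaps the original rectangle
        replica.append(None)                      # attach a fresh column on the right
        for j in range(len(replica) - 2, left - 1, -1):
            replica[j + 1] = replica[j]           # shift columns, rightmost first
        replica[left] = None
        left += 1
        rounds += 1
    return [col for col in replica[w:] if col is not None], rounds
```

In this model a rectangle with $w$ columns is fully shifted after $w$ rounds and occupies $2w$ columns in total, matching the population requirement of $2|V(R_G)|$ nodes.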

Squaring. We give some more details on the squaring process. Recall that we are given a connected 
shape $G$, called the original, that we want to enclose in the minimal rectangle $R_G$ containing 
it. We claim that squaring can be performed by local detections and actions without the need of a 
unique leader in the shape. 

Fig. 10. (a) If the connected shape is not yet a rectangle then at least one of these locally detectable shapes must 
exist. (b) An example of an incomplete rectangle, in which all four detection shapes appear. Black nodes and the 
connections between them constitute the original shape $G$. White nodes and the remaining connections have been 
introduced by the execution of the squaring process so far. 


Proposition 1. At least one of the shapes of Figure 10(a) exists in a connected shape $G$ iff $G$ is 
not a rectangle. Such shapes can be used to locally detect that $G$ is not a rectangle yet. 


Proof. Clearly, as long as $G$ is still a proper subgraph of $R_G$, at least one edge or node (or both)
is present in $R_G$ but not in $G$. Observe that if an edge is missing but its endpoint nodes are both
present and connected by active edges to the rest of the shape, then it is trivial to detect the
absence of the edge locally: each of the two endpoints knows that it is connected to the shape, thus
the two nodes just have to activate the edge joining them when they interact over it. So, we can
w.l.o.g. assume that all possible edges are present between the nodes of $G$ and focus on the case
that some node is missing. In fact, if some node is missing, then it must hold that at least two rows
or at least two columns of $G$ have unequal lengths. Let us only consider the row case, as the other is
symmetric. If there exist two rows in $G$ with unequal lengths then there must necessarily exist two
consecutive rows of $G$ with unequal lengths (otherwise, if all consecutive rows preserved the length
then all rows would have the same length). As $G$ is connected, there must be at least one active
vertical edge joining the two rows. Then we can begin from that edge and, by walking either to the
left or to the right, we must meet the first position at which the two rows differ. For example, if
it was to the right, then node $u$ of one row has a right neighbor $u_r$, while the node $v$ above it (or
below it, respectively) does not. Then it is trivial for the triple $(u, u_r, v)$ to locally detect that
$v_r$ is missing and be ready to attach a free node to the required position when such a node arrives.
Conversely, if $G$ is a rectangle then none of the above locally detectable shapes can exist. □ 

So, we can have all outer nodes of the segment of $R_G$ that has been constructed so far
locally handle the squaring process without the help of the leader. The leader is only required up
to this point to detect termination of the squaring phase. In particular, it suffices for the leader
to detect that its component has become a rectangle, because the above process guarantees that
when this occurs the constructed rectangle must be equal to $R_G$. To achieve this the leader can,
for example, begin from the leftmost column that it knows and attempt to traverse a rectangle in
a zig-zag way, e.g. up the first column, then right, then down the second column, and so on. If it
manages to complete such a traversal without encountering any recesses or overhangs, then it
concludes that its component is a rectangle. Finally, it is worth mentioning that, in order to
execute correctly, the above replication protocol requires a population of size at least
$2|V(R_G)|$ and its waste is $2(|V(R_G)| - |V(G)|)$. 
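
The local detection rule of Proposition 1 can be simulated on a set of grid cells. This is a centralized sketch of what the nodes do in parallel, with all names (`square_up`, `bounding_rectangle`, `replication_waste`) assumed for illustration: whenever a node $u$ has a horizontal neighbor $u_r$ and a vertical neighbor $v$ but the fourth corner $v_r$ is empty, a free node is attached at $v_r$.

```python
from typing import Set, Tuple

Cell = Tuple[int, int]


def bounding_rectangle(shape: Set[Cell]) -> Set[Cell]:
    """All cells of the smallest axis-parallel rectangle containing the shape."""
    xs = [x for x, _ in shape]
    ys = [y for _, y in shape]
    return {(x, y)
            for x in range(min(xs), max(xs) + 1)
            for y in range(min(ys), max(ys) + 1)}


def square_up(shape: Set[Cell]) -> Set[Cell]:
    """Repeatedly fill a missing corner v_r detected by a triple (u, u_r, v)."""
    cells = set(shape)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(cells):           # u = (x, y)
            for dx in (-1, 1):
                for dy in (-1, 1):
                    u_r = (x + dx, y)        # horizontal neighbor of u
                    v = (x, y + dy)          # vertical neighbor of u
                    v_r = (x + dx, y + dy)   # position that may be missing
                    if u_r in cells and v in cells and v_r not in cells:
                        cells.add(v_r)       # a free node attaches here
                        changed = True
    return cells


def replication_waste(shape: Set[Cell]) -> int:
    """Waste 2(|V(R_G)| - |V(G)|) of the replication protocol."""
    return 2 * (len(bounding_rectangle(shape)) - len(shape))
```

A filled cell always shares a row and a column with existing cells, so the process never leaves the bounding rectangle of the original shape.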

7.2 Approach 2 

Again the original shape is first squared. Then each column is assigned a unique matching identifier 
as in the previous section, so that column i matches uniquely to column i — 1. Then the rightmost 
column is replicated, by attaching free nodes to its right. Replication includes also the unique 
matching key. When replication completes, first the replica-column is released and then the original 
column, so that both move freely in the solution. If desired, we can have replica-columns (or just 
their keys) to use different states than original columns so that no two columns of different kinds 
ever become connected. The process continues with the other columns, each time replicating the 
rightmost one and then releasing both the replica and the original column. Eventually, all columns 
will be replicated and released. Moreover, as original (replica) column $i$ uniquely matches to original 
(replica) column $i - 1$, eventually both the original and the replica rectangles are correctly assembled. 
Finally, a de-squaring phase is executed as before on both rectangles, to release the dummy nodes 
that were used for forming the rectangles and isolate the two identical shapes. 

8 Conclusions and Further Research 

There are several interesting open problems related to the findings of this work. A possible
refinement of the model could be a distinction between the speed of the scheduler and the internal
operation speed of a component. For example, a connected component will operate in synchronous
rounds, where in each round a node observes its neighborhood and its own state and updates its
state based on what it sees. Nodes can of course also update the state of their local connections
and we may assume that a connection is formed/dropped if both nodes agree on the decision
(another possibility is to allow a link to change state if at least one of the nodes says so). This
distinction between two different “times”, though ignored so far in the literature, is very natural
because a connected component should operate at a different speed than it takes for the scheduler
to bring two nodes (e.g. of different components, or an isolated node and a node of some component)
into contact. 

It would also be interesting to consider for the first time a hybrid model combining active
mobility controlled by the protocol and passive mobility controlled by the environment. For example,
it could be a combination of the Nubot model and the model presented in this work. Another very
intriguing problem is to give a proof, or strong experimental evidence, of Conjecture 1. If true,
it would imply that there is no analogue of Theorem 1 if all processes are identical. A possibility
left open then would be to achieve high probability counting with $f(n)$ leaders. There is also work
to be done w.r.t. analyzing the running times of our protocols and our generic constructors and
proposing more efficient solutions. Also, it is not yet clear whether the protocol of Section 5.1 is
the fastest possible nor whether its success probability or the upper bound on $n$ that it guarantees
can be improved; a proof would be useful. Moreover, it is not obvious what the class of shapes and
patterns that the TMs considered here compute is. Of course, it was sufficient as a first step to draw
the analogy to such TMs because it helped us establish that our model is quite powerful. However,
we would still like to have a characterization that gives some more insight into the actual shapes and
patterns that the model can construct. 

It would be also important to develop models (e.g. variations of the one proposed here) that 
take other real physical considerations into account. In this work, we have restricted attention to 
some geometrical constraints. Other properties of interest could be weight, mass, strength of bonds, 
rigid and elastic structure, collisions, and the interplay of these with the interaction pattern and the 
protocol. Moreover, in real applications mere shape construction will not be sufficient. Typically, 
we will desire to output a shape/structure that optimizes some global property, like energy and 
strength, or that achieves a desired behavior in the given physical environment. The latter also 
indicates that the construction and the environment that the construction inhabits cannot be 
studied in isolation. Instead, the two will constantly affect each other, the optimal output will 
highly depend on the optimality that the environment allows and also the environment may highly 
and continuously affect the construction process. The capability of the environment to affect the 
construction process suggests many robustness issues. Imagine an environment that can at any 
given time break an active link with some (small) probability (a similar question was also posed to 
the author during his talk at PODC ’14 by an attendee, whom the author would like to thank). 
Under such a perpetual setback no construction can ever stabilize. However, we may still be able 
to have a construction that constantly exists in the population by evolving and self-replicating. 

Finally, in the same spirit, it would be interesting to develop routines that can rapidly 
reconstruct broken parts. For example, imagine that a shape has stabilized but a part of it 
detaches, all the connections of the part become deactivated, and all its nodes become free. Can 
we detect and reconstruct the broken part efficiently (and without resetting the whole population 
and repeating the construction from the beginning)? What knowledge about the whole shape 
should the nodes have to be able to reconstruct missing parts of it? 